JOURNAL 


OF SPEECH 
AND HEARING 


RESEARCH 


VOLUME 1, NUMBER 2 JUNE, 1958 





A Journal of the 


AMERICAN SPEECH AND HEARING ASSOCIATION








Manuscripts and related correspondence should be addressed to Dorothy Sherman, Editor, 
Journal of Speech and Hearing Research, Department of Speech Pathology and Audiology,
East Hall, State University of Iowa, Iowa City, Iowa. 


Subscriptions and orders for back numbers should be addressed to Kenneth O. Johnson,
Business Manager, Journal of Speech and Hearing Research, American Speech and Hear- 
ing Association, 1001 Connecticut Avenue N. W., Washington 6, D.C. 



Notice to Authors


Before submitting manuscripts for publication authors should consult Information for
Contributors to the Journal of Speech and Hearing Research, JSHR, 1, 1958, 94-96. Prospective
authors are invited to write the Editor for copies of this Note, and for its supplement,
Examples of Manuscript Form. It is assumed that authors keep carbon copies of
their manuscripts since the JSHR cannot assume responsibility for loss. No manuscript 
can be considered for publication in the JSHR which has appeared elsewhere. 


The Association reserves the right to keep all copies of any manuscript submitted to it. 


Notice to Members


Member authors of ASHA are requested to provide autographed copies of their bound 
publications for a library in the National Office. Books should be sent to Kenneth O. 
Johnson, Executive Secretary, American Speech and Hearing Association, 1001 Con- 
necticut Avenue, N.W., Washington 6, D. C. 





The Journal of Speech and Hearing Research is published by the American Speech 
and Hearing Association in March, June, September and December of each year. 


Anyone has permission to reproduce a portion of any article in this issue of the 
Journal if permission is also obtained from the author. Quotations must be accurate, and 
credit must be given to the author or authors. Credit must be given to the Journal of 
Speech and Hearing Research by bibliographic reference or other suitable form. 


Subscription: $5.00 per year in U.S.A.; $5.50 per year outside U.S.A.; $1.35 per single
copy. Entry pending as second-class matter at the Post Office in Danville, Illinois, under 
Act of Congress of March 3, 1879. Printed by The Interstate Printers and Publishers, Inc., 
19-27 North Jackson Street, Danville, Illinois. Office of Publication, Jackson at Van Buren, 
Danville, Illinois. 







JOURNAL OF 


SPEECH AND HEARING 
RESEARCH 


Volume 1 June 1958 Number 2 


Some Properties of the Glottal Sound Source 
JAMES L. FLANAGAN 


Frequency-Intensity Relationships and Optimum Pitch Level 
WAYNE L. THURMAN


Listener Evaluations of Speech Interruptions 
DEAN E. WILLIAMS AND LOUISE R. KENT 


Listener Responses to Non-Fluencies
RICHARD M. BOEHMLER 


Two Voice-Message Storage Schemes 


PAUL O. THOMPSON, JOHN C. WEBSTER, ROY G. KLUMPP, AND 
WALTER F. BERTSCH 


Some Variables Affecting Perceived Harshness 
MARYJANE REES 


Objective Speech Audiometry: A New Method Based on Electrodermal Response 
HOWARD B. RUHM AND RAYMOND CARHART 


Stapedolysis (Stapes Mobilization) and the Nomograph Technic 
VICTOR GOODHILL 


Graduate Theses in Speech and Hearing Research — 1956 
FRANKLIN H. KNOWER 


Book Review 








THE AMERICAN SPEECH AND HEARING ASSOCIATION 


OFFICERS 
President Jon Eisenson, Ph.D., Queens College 
Executive Vice-President Stanley Ainsworth, Ph.D., Univ. of Georgia 
Vice-President Leo G. Doerfler, Ph.D., Univ. of Pittsburgh 
Editor of the Association Robert West, Ph.D., Brooklyn College 
OFFICERS-ELECT 
President-Elect George A. Kopp, Ph.D., Wayne State Univ. 
Executive Vice-President-Elect Jack Matthews, Ph.D., University of Pittsburgh 
Vice-President-Elect Miriam D. Pauls, Ph.D., Johns Hopkins Med. Instit.
Editor of the Association-Elect Wendell Johnson, Ph.D., Univ. of Iowa 


COUNCIL 
The Officers and the following Councilors: 


Raymond Carhart, Ph.D. Hayes A. Newby, Ph.D., (1957-59) 
Ernest H. Henrikson, Ph.D., (1956-58) Wilbert L. Provonost, Ph.D., (1958-60) 
Ira J. Hirsch, Ph.D., (1958-60) Sylvia O. Richardson, M.D., (1957-59) 
Ruth B. Irwin, Ph.D., (1956-59) Charlotte G. Wells, Ph.D., (1956-58) 
Herold S. Lillywhite, Ph.D., (1958-59) 

EXECUTIVE SECRETARY 


Kenneth O. Johnson, Ph.D. 


APPLICATIONS FOR MEMBERSHIP SHOULD BE ADDRESSED TO THE EXECUTIVE SECRETARY 





THE JOURNAL OF SPEECH AND HEARING RESEARCH 


EDITOR 
Dorothy Sherman, Ph.D. 


ASSISTANT TO THE EDITOR 
Jean B. Kern, Ph.D. 


ASSISTANT EDITOR 
Frederic L. Darley, Ph.D. 


STATISTICAL CONSULTANT 
Leonard S. Feldt, Ph.D. 


ASSOCIATE EDITORS 
Oliver Bloodstein, Ph.D. 
Arthur S. House, Ph.D. 
James F. Jerger, Ph.D. 
D. E. Morley, Ph.D. 
Hildred Schuell, Ph.D. 
William R. Tiffany, Ph.D. 
John C. Webster, Ph.D. 
Joseph M. Wepman, Ph.D. 


DEPARTMENT EDITORS 
Ernest H. Henrikson, Ph.D. Book Reviews 
Martin F. Palmer, Sc.D. Records 


EDITOR OF THE ASSOCIATION 
Robert West, Ph.D. 


BUSINESS MANAGER 
Kenneth O. Johnson, Ph.D. 


























Some Properties Of The Glottal Sound Source 


James L. Flanagan 


Recent researches (2, 8, 9, 12, 13) 
have demonstrated the usefulness of 
electrical analog techniques in study- 
ing the physics of speech production. 
Related techniques also have been 
found rewarding in the electrical syn- 
thesis of speech (3) and in schemes 
for reducing the channel capacity 
necessary to transmit speech (5). 
Basic to such an approach is an elec- 
trical system whose transmission 
properties are similar to the acous- 
tical properties of the vocal tract. In 
the analog, electrical voltages usually 
are made analogous to sound pres- 
sures and electrical currents analogous 
to acoustic volume velocities. The 
electrical elements of the analog are 
manipulated to simulate the config- 
urations of the vocal system. 
Having derived an electrical model 
that possesses the appropriate trans- 
mission properties (this usually is 
done from measurements on X-rays 
of the vocal tract), it is of consider- 
able importance to know how the 
analog should be excited. Ideally the 
excitation of the model should be ex- 
actly analogous to the excitation of 
the human vocal tract, and this, of 
course, requires a knowledge of the 
properties of the vocal excitation. 
Acoustic measurements on the sources 





James L. Flanagan (Sc.D., Massachusetts 
Institute of Technology, 1955) is Member 
of the Technical Staff, Bell Telephone Lab- 
oratories, Murray Hill, New Jersey. 


of excitation within the vocal tract, 
however, are very difficult to make. 
Consequently, the excitation of ex- 
isting analogs usually has been ar- 
rived at largely on empirical grounds. 
Such determination frequently leaves 
something to be desired since it often 
is difficult to localize accurately cer- 
tain inadequacies in the simulation. 
An attempt has been made, therefore, 
to determine quantitatively some of 
the properties of the vocal excitation 
for voiced sounds. 

To establish a frame of reference 
for the dimensions that must be con- 
sidered, a highly schematized diagram 
of the lower respiratory system of an 
adult male is shown in Figure 1. The 
diagram essentially represents a me- 
dian-plane section, perpendicular to 
an anterior-posterior line, through 
the glottis and trachea. 

The voiced sounds of speech are
produced by exciting the tract with
the glottal or vocal fold source. The
vibrating folds act as a variable valve 
that allows quasi-periodic pulses of 
air to escape from the lungs into the 
tract. These air ‘puffs’ produce an im- 
pulsive excitation of the acoustical 
conduit constituting the vocal tract. 
If it were possible to determine 
the waveform of the volume flow 
through the glottis, and the equivalent 
acoustic impedance of the glottis, 
then it should be possible to make an 
electrical source which generates a 
current similar to the glottal volume 




Figure 1. Simplified schematic diagram of 
lower respiratory system of adult male. 


flow, and which has an electrical im- 
pedance analogous to the acoustic 
impedance of the glottal source. Be- 
cause the phase spectrum of speech 
usually is of secondary importance 
to intelligibility, the next best analo- 
gous source would be a periodic one 
whose amplitude spectrum (but not 

























Figure 2. Idealization of the glottis as an 
orifice separating two relatively large tubes. 


necessarily waveform) has the same 
shape as that of the glottal volume 
velocity function, and whose inter- 
nal impedance is of an appropriate 
value. It is of consequence, there- 
fore, to establish some of the prop- 
erties of the glottal volume flow, its 
amplitude spectrum, and the acoustic 
impedance of the glottis.

To a first approximation the glot- 
tis might be considered as an orifice 
separating two relatively large tubes 
(corresponding crudely to the trachea 
and vocal tract, respectively) as 
shown in Figure 2. As a start, the 
simple relations for steady incompressible
flow through such an orifice
might be considered. 


Steady Flow Through An Orifice 


Let the pressures on the subglottic
and supraglottic sides of the orifice be
$P_1$ and $P_2$, respectively. Let the particle
velocity of the air in the orifice
be $U$, and the area of the orifice be $A$.
If the cross-sectional areas of the
adjacent tubes are much larger than
$A$, variations in the pressures $P_1$ and $P_2$
caused by flow through the orifice are
very small, and $P_1$ and $P_2$ can be assumed
to remain sensibly constant.
Also, if the velocity of air in the tubes 
is very small compared with the 
velocity in the orifice, the kinetic 
energy of the air in the tubes may be 
neglected. Further, if the dimensions 
of the orifice are small compared with 
the wavelength of an acoustic dis- 
turbance in the medium, and if the 
mean flow velocity is much smaller 
than the speed of sound, an acoustic 
disturbance is known essentially in- 
stantaneously throughout the vicinity 
of the orifice. In such a case it is 
reasonably valid to assume conditions 
of incompressibility. In addition, let 










it be assumed that the distribution of 
velocity is uniform over the orifice 
and that there is no viscous dissipa- 
tion. Under these assumptions, the 
kinetic energy per-unit-volume pos- 
sessed by the air in the orifice is de- 
veloped by the pressure difference 
$(P_1 - P_2)$, and is

$$P_1 - P_2 = \rho U^2/2, \tag{1}$$

where $\rho$ is the density of the air. Rewriting
(1) yields for the particle
velocity

$$U = [2(P_1 - P_2)/\rho]^{1/2}. \tag{2}$$

If the resistance of the orifice, $R$, is
defined as the ratio of pressure drop
$(P_1 - P_2)$ to volume flow $(UA)$, then

$$R = \rho U/2A = \rho Q/2A^2, \tag{3}$$

where $Q = U \cdot A$ is the volume velocity.

In practical situations, consider- 
able variation actually exists in the 
distribution of particle velocity over 
the orifice, and the assumption of uni- 
form velocity distribution is not par- 
ticularly good. To determine the 
volume discharge at the orifice pre- 
cisely it is necessary to integrate the 
particle velocity over the orifice area, 
and this requires a knowledge of the 
velocity profile. However, the con- 
verging flow establishes a contraction 
of the jet at a short distance downstream
at the so-called vena contracta,
and at this cross-section the streamlines
are essentially straight and parallel,
and the velocity distribution uniform.
If $A_c$ and $U_c$ are the area and
particle velocity at the vena contracta,
then the volume flow is $A_c U_c$. If $A$
and $A_c$ are related by an empirically
determined contraction coefficient,
$\delta_c$, so that


$$A_c = \delta_c A, \tag{4}$$

and if $U_c \approx U$, then (3) might be
modified and written more accurately
as

$$R = \rho Q/2(\delta_c A)^2. \tag{5}$$


In a similar vein, the pressure-to- 
kinetic energy conversion in the 
orifice can never be accomplished 
without some viscous loss, and the 
particle velocity is actually some- 
what less than that given by (2). If 
a simple relation of the form (5) is 
to be used to approximate the resist- 
ance of an orifice, it is customary 
practice to introduce an empirically 
determined velocity coefficient, $\delta_v$,
to account for viscous loss, so that
(2) is modified to

$$U = \delta_v [2(P_1 - P_2)/\rho]^{1/2}, \tag{6}$$

and (5) is revised to

$$R = \frac{\rho Q}{2(\delta_v \delta_c A)^2} = \frac{\rho Q}{2(\delta A)^2}, \tag{7}$$

where $\delta = \delta_v \delta_c$ is an empirical flow
coefficient for the orifice.

The coefficient $\delta$ is dependent
upon Reynolds' number, but for
orifices of dimensions comparable to
the glottis, and for flows reasonably
representative of vocal flows, $\delta$ is of
the order of 0.8.
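As an illustration of relations (6) and (7), the following minimal Python sketch evaluates the volume velocity and the kinetic resistance of an orifice. Only the flow coefficient of about 0.8 comes from the text. The air density, the pressure conversion, the function name `orifice_flow`, and the example pressure and area are assumptions made for the sketch. Because only the combined flow coefficient is quoted, the sketch uses the volume velocity in the form Q = (flow coefficient) x A x [2(pressure drop)/density]^(1/2), which follows from (4) and (6).

```python
import math

RHO_AIR = 1.14e-3          # g/cm^3; assumed density of warm, moist air (not from the article)
DYN_PER_CM_H2O = 980.665   # dyn/cm^2 per cm H2O (standard conversion)


def orifice_flow(delta_p_cm_h2o, area_cm2, delta=0.8, rho=RHO_AIR):
    """Return (Q, R) for steady orifice flow.

    Q is the volume velocity in cm^3/sec and R the resistance in
    dyn-sec/cm^5, computed from relation (7) with flow coefficient delta.
    """
    delta_p = delta_p_cm_h2o * DYN_PER_CM_H2O
    q = delta * area_cm2 * math.sqrt(2.0 * delta_p / rho)
    r = rho * q / (2.0 * (delta * area_cm2) ** 2)
    return q, r


# Hypothetical example: 8 cm H2O across a 6 mm^2 (0.06 cm^2) orifice.
q, r = orifice_flow(8.0, 0.06)
print(f"Q = {q:.0f} cm^3/sec, R = {r:.1f} dyn-sec/cm^5")
```

By construction the product R x Q returns the applied pressure drop, which is a convenient check that (6) and (7) are mutually consistent.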

If the area of the orifice and the
velocity of flow are sufficiently 
small, the discharge actually may be 
governed by viscous laws rather than 
by the kinetic energy of the issuing 
particles. An expression such as (7), 
therefore, may not be very precise 
for small flow velocity and orifice 
area. A formula for orifice resistance, 
valid also for small values of velocity 
and area, should include a term to 













Figure 3. Model of human larynx used in 
steady flow measurements. After van den 
Berg et al. (15). 


account for the ‘low velocity’ resist- 
ance. To a first approximation, the 
expression might have a form that is 
some linear combination of kinetic 
and viscous terms, such as 


$$R = R_v + k(\rho Q/2A^2), \tag{8}$$

where $R_v$ is a viscous resistance and
$k$ is a constant. For steady laminar
flow in smooth tubes, the Hagen-Poiseuille
law shows $R_v$ to be proportional
to the kinematic coefficient of
viscosity and the length of the con- 
ducting passage, and inversely pro- 
portional to a function of the cross- 
sectional area. 


Flow Resistance of the 
Glottal Orifice 


Both Wegel (16) and van den Berg 
et al. (15) have made steady flow 
measurements on models of the 
human larynx, and have obtained 
empirical formulas for the resistance 
of the glottal orifice. The two ex- 
perimentally derived formulas, while 
differing in form, yield results that 
are essentially similar. The measure- 
ments of van den Berg et al., how- 
ever, are the more extensive and were 


made on an actual plaster cast model 
of a normal larynx. In van den Berg’s 
model the glottis was idealized as a 
rectangular slit whose length was 
maintained constant at 18 mm. 
Changes in the area of the orifice 
were made by changing the width, 
w, of the slit, as shown in Figure 3. 
Van den Berg gives the resistance of 
the glottis to steady flow as 


$$R = \frac{12 \mu L}{l w^3} + 0.875\,\frac{\rho Q}{2A^2}, \tag{9}$$

where $\mu$ is the coefficient of viscosity,
$L$ is the thickness of the glottis and
$l$ is the length of the glottal slit. Van
den Berg says that expression (9)
holds well for $0.1 \le w \le 2.0$ mm,
for subglottic pressures up to 64 cm
H2O at small widths, and for volume
velocities up to 2000 cm$^3$/sec at large
widths.
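Because the resistance in (9) itself depends on the volume velocity, finding the flow for a given pressure means solving a quadratic in Q. The sketch below shows one way to do this. The viscosity and density of air, the 3 mm glottal thickness used as a default, and the function names are assumptions for illustration; only the 18 mm slit length and the form of (9) are taken from the text.

```python
import math

RHO_AIR = 1.14e-3          # g/cm^3; assumed density of air
MU_AIR = 1.86e-4           # dyn-sec/cm^2; assumed viscosity of air
DYN_PER_CM_H2O = 980.665   # dyn/cm^2 per cm H2O


def glottal_resistance(q, width_cm, length_cm=1.8, thickness_cm=0.3):
    """Resistance of the rectangular slit per expression (9), CGS units."""
    area = length_cm * width_cm
    viscous = 12.0 * MU_AIR * thickness_cm / (length_cm * width_cm ** 3)
    kinetic = 0.875 * RHO_AIR * q / (2.0 * area ** 2)
    return viscous + kinetic


def glottal_flow(ps_cm_h2o, width_cm, length_cm=1.8, thickness_cm=0.3):
    """Solve Ps = R(Q) * Q for the positive root Q (cm^3/sec)."""
    if width_cm <= 0.0:
        return 0.0  # closed glottis, no flow
    ps = ps_cm_h2o * DYN_PER_CM_H2O
    area = length_cm * width_cm
    a = 12.0 * MU_AIR * thickness_cm / (length_cm * width_cm ** 3)
    b = 0.875 * RHO_AIR / (2.0 * area ** 2)
    # Positive root of b*Q^2 + a*Q - ps = 0, written to avoid cancellation
    # when the viscous term a is very large (small widths).
    return 2.0 * ps / (a + math.sqrt(a * a + 4.0 * b * ps))


# Hypothetical example: 8 cm H2O with a slit width of 1/3 mm (area 6 mm^2).
w = 0.06 / 1.8
q = glottal_flow(8.0, w)
print(f"Q = {q:.0f} cm^3/sec, R = {glottal_resistance(q, w):.1f} dyn-sec/cm^5")
```

The same routine can be swept over width and pressure to tabulate steady-flow values of Q against glottal area.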

Since R is the ratio of subglottic 
pressure to glottal volume flow, it is 
possible, if values of subglottic pres- 
sure and glottal area are given, to de- 
duce from (9) corresponding values 
of glottal volume flow for steady flow 
conditions. The question now arises 
as to how precisely expression (9), 
or the more approximate expression 
(7), holds in the case of non-steady 
flow, occasioned primarily by changes 
in the glottal area with time. It is 
apparent that if the inertial and com- 
pliant effects of the medium could 
be neglected, the foregoing relations 
also could be applied in the time- 
varying situation. If the orifice were 
made to execute some periodic varia- 
tion in area, it might be asked at what 
frequency does the mass reactance of 
the fluid passing the orifice become 
appreciable. To get a very rough idea 
of this value, the case may be con- 
sidered where the area is made to 
execute a step function, so that 





$$A(t) = \begin{cases} 0, & t < 0 \\ A, & t \ge 0. \end{cases} \tag{10}$$


An estimate can be made of the 
time required for the flow to build 
up to the steady-state value when a 
constant pressure difference is main- 
tained across the orifice. Referring to 
the simple diagram in Figure 2, the 
mass of the air plug in the orifice is
assumed to be $\rho A L_e$, where
$L_e = (L + 0.8\sqrt{A})$ is the effective
thickness of the orifice taking into
account an 'end effect' for the air
plug. It is further assumed that
the pressure difference acts to accelerate
the air plug only over the
distance $L_e$. The equation of motion
for the plug is, therefore,

$$\rho L_e \frac{dU}{dt} = (P_1 - P_2). \tag{11}$$

From (10), $U(t) = 0$ for $t = 0$, and
the solution for (11) is

$$U(t) = \frac{(P_1 - P_2)}{\rho L_e}\, t, \tag{12}$$

which essentially is valid only for
positive values of $t$ near to zero. An
estimate of the rise-time of the flow
may be made by assuming $U(t)$ to increase
linearly until the ideal steady-state
value $U_s = [2(P_1 - P_2)/\rho]^{1/2}$ is
reached. The time required to accomplish
this build-up is, therefore,

$$T = \frac{2 L_e}{[2(P_1 - P_2)/\rho]^{1/2}}. \tag{13}$$





This time is somewhat less than the 
actual rise-time since the fluid accel- 
eration does not remain constant, but 
drops off as the steady-state velocity 
is reached. Even so, T serves as a 
reasonable measure of minimum rise- 
time. 




It might be assumed, somewhat 
arbitrarily, that an orifice area varia- 
tion whose period is at least of an 
order of magnitude greater than T 
is sufficiently slow so that the fluctua- 
tions in volume flow may be con- 
sidered as a series of consecutively 
established steady states. If such an as- 
sumption is tenable, then the previous 
relations for steady flow may be ap- 
plied in cases where the variations in 
area take place in times greater than 
10T. This says, in other words, that 
the acoustic mass reactance of the 
orifice is relatively small compared 
with its resistance, and the ‘static’ re- 
lations between orifice area, pres- 
sure differential, and volume flow 
may be used just as the static char- 
acteristic curves for a vacuum tube 
are used in dynamic applications. An 
additional piece of experimental evi- 
dence lends confidence to the fore- 
going assumption. Westervelt and 
McAuliffe (17) have found that the 
acoustic mass reactance of an orifice 
in the presence of a 'dc' through flow
is decreased to about 1/3 of its value 
in the absence of through flow for 
particle velocities in excess of about 
300 cm/sec. This means that the 
acoustic inertance of the orifice is 
appreciably less than $\rho L_e/A$ in the
presence of such through flow. 


Expression (13) can be evaluated 
in terms of typical glottal dimensions 
and subglottic pressure. The follow- 
ing values may be taken as typical 
for an adult male during normal con- 
versational speech: the thickness (L) 
of the glottal orifice is approximately 
3 mm; the mean glottal area is approximately
6 mm$^2$; the subglottic
pressure $(P_1 - P_2)$ is of the order of 8
cm H2O. Using these values in (13)
yields 


$$T = 0.3 \text{ msec}, \quad \text{and} \quad 1/10T = 330 \text{ cps}. \tag{14}$$
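The arithmetic behind (14) can be reproduced with a few lines of Python. This is a minimal sketch of the rise-time estimate, taken as twice the effective thickness divided by the ideal steady-state velocity, with the effective thickness equal to the thickness plus 0.8 times the square root of the area. The air density and the pressure conversion factor are assumed values not stated in the article, and `rise_time` is a hypothetical helper name.

```python
import math

RHO_AIR = 1.14e-3          # g/cm^3; assumed density of air
DYN_PER_CM_H2O = 980.665   # dyn/cm^2 per cm H2O


def rise_time(thickness_cm, area_cm2, delta_p_cm_h2o, rho=RHO_AIR):
    """Minimum rise-time estimate of relation (13), in seconds."""
    effective_thickness = thickness_cm + 0.8 * math.sqrt(area_cm2)
    steady_velocity = math.sqrt(2.0 * delta_p_cm_h2o * DYN_PER_CM_H2O / rho)
    return 2.0 * effective_thickness / steady_velocity


# Typical values quoted in the text: L = 3 mm, A = 6 mm^2, 8 cm H2O.
t_rise = rise_time(0.3, 0.06, 8.0)
print(f"T = {t_rise * 1e3:.2f} msec, 1/10T = {1.0 / (10.0 * t_rise):.0f} cps")
```

Under these assumptions T comes out a little under 0.3 msec, in line with the rounded value in (14). The frequency limit 1/10T is sensitive to that rounding, so the unrounded figure is somewhat higher than 330 cps.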







For such conditions and for varia- 
tions in glottal area of frequency less 
than about 300 cps, it appears reason- 
able to consider the glottal flow as a 
series of consecutively established 
steady states, and to evaluate the time- 
varying glottal resistance from the 
steady flow relation (9), or from the 
more approximate relation (7). It 
also can be seen from (13) that T is 
inversely proportional to the square 
root of the subglottic pressure. The 
value of 1/10T in (14), therefore, 
would be increased to about 470 cps if 
the subglottic pressure were doubled, 
and decreased to about 230 cps if the 
pressure were halved. Happily, the 
subglottic pressure and the minimum 
produceable sound pressure level are 
essentially monotonically increasing 
functions of vocal pitch. 

Proceeding under the assumption 
that the rationale for applying the 
steady flow relations to the time- 
varying case is sufficiently valid, re- 
lation (9) shows that for small glot- 
tal areas, where the flow is con- 
trolled mainly by the viscous term, 
Q is essentially proportional to 
$(P_1 - P_2)A^3$. Similarly, for large glottal
openings, where the flow is determined
chiefly by the kinetic term,
$Q$ is essentially proportional to
$(P_1 - P_2)^{1/2}A$. Van den Berg has
pointed out that the leading and trail- 
ing edges of the volume velocity 
waveform are more steep than those 
of the area function, and it is to be 
expected that the glottis will generate 
more intense higher harmonics than 
those computed from an harmonic 
analysis of A(t). One facet of the 
problem under consideration is the 
relative importance of this ‘sharpen- 
ing’ effect. 
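The two limiting proportionalities can be checked directly from (9). The short derivation below is a sketch added for clarity, not part of the original analysis. It assumes the rectangular-slit idealization, so that the area is the product of slit length and width, writes the pressure drop as a single symbol $P_s$, and neglects one term of (9) at a time.

```latex
\begin{align*}
P_s &= RQ = \frac{12\mu L}{l w^{3}}\,Q + 0.875\,\frac{\rho Q^{2}}{2A^{2}},
  \qquad A = l w, \\[4pt]
\text{small } A \text{ (viscous term dominant):}\quad
Q &\approx \frac{l w^{3}}{12\mu L}\,P_s
   = \frac{A^{3}}{12\mu L\, l^{2}}\,P_s \;\propto\; P_s A^{3}, \\[4pt]
\text{large } A \text{ (kinetic term dominant):}\quad
Q &\approx A\left(\frac{2P_s}{0.875\,\rho}\right)^{1/2} \;\propto\; P_s^{1/2} A .
\end{align*}
```

The cubic dependence at small openings is what steepens the leading and trailing edges of the flow pulse relative to the area pulse.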

Since it is desired to determine the 
volume velocity waveform, given the 
glottal area function and the sub- 





Figure 4. Relation between glottal volume
velocity and glottal area for different values
of subglottic pressure. (Abscissa: glottal area in mm$^2$.)


glottic pressure, van den Berg’s for- 
mula (9) has been used to make a 
plot of $Q$ vs. $A$, with $P_s = (P_1 - P_2)$
as a parameter. This plot is shown in 
Figure 4. It is now proposed that 
Figure 4 be used to deduce wave- 
forms of glottal volume velocity from 
existing data on glottal area and sub- 
glottic pressure, and that Fourier 
analyses be made of the area and 
volume velocity functions to obtain 
their amplitude spectra. 
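The proposed procedure, reading a flow value from the steady-flow relation for each instantaneous area, can be sketched in Python as follows. The area waveform here is a purely hypothetical raised-sine pulse, not data from Fletcher. The air constants, the 3 mm thickness, and all function names are likewise assumptions made for illustration, and the flow is obtained by solving (9) numerically rather than by reading a graph.

```python
import math

RHO_AIR = 1.14e-3           # g/cm^3; assumed
MU_AIR = 1.86e-4            # dyn-sec/cm^2; assumed
DYN_PER_CM_H2O = 980.665    # dyn/cm^2 per cm H2O
SLIT_LENGTH_CM = 1.8        # 18 mm slit length of the model
THICKNESS_CM = 0.3          # assumed glottal thickness


def flow_from_area(area_cm2, ps_cm_h2o):
    """Quasi-steady volume velocity (cm^3/sec) from expression (9)."""
    if area_cm2 <= 0.0:
        return 0.0
    width = area_cm2 / SLIT_LENGTH_CM
    ps = ps_cm_h2o * DYN_PER_CM_H2O
    a = 12.0 * MU_AIR * THICKNESS_CM / (SLIT_LENGTH_CM * width ** 3)
    b = 0.875 * RHO_AIR / (2.0 * area_cm2 ** 2)
    return 2.0 * ps / (a + math.sqrt(a * a + 4.0 * b * ps))


def area_waveform(n=72, a_max_cm2=0.15, open_fraction=0.6):
    """Hypothetical one-cycle area pulse: raised sine while open, zero while closed."""
    samples = []
    for i in range(n):
        phase = i / n
        if phase < open_fraction:
            samples.append(a_max_cm2 * math.sin(math.pi * phase / open_fraction) ** 2)
        else:
            samples.append(0.0)
    return samples


areas = area_waveform()
flows = [flow_from_area(a, 4.0) for a in areas]   # constant Ps = 4 cm H2O
a_peak, q_peak = max(areas), max(flows)
for a, q in list(zip(areas, flows))[:6]:
    print(f"A/Amax = {a / a_peak:.3f}   Q/Qmax = {q / q_peak:.3f}")
```

In this sketch the normalized flow never exceeds the normalized area, so the flow pulse rises later and more steeply than the area pulse.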


Subglottic Pressure During 
Phonation 


Several investigators have made 
measurements of mean subglottic
pressure during phonation, and have










Table 1. Relations between pitch, sound pressure level and mean subglottic pressure, adapted
and averaged from data reported by van den Berg (14).

Pitch (cps)   Liminal SPL (db)*   Subglottic Pressure (cm H2O)†
                                  0 db    +5 db   +10 db   +15 db
    97              56              4       7        9       --
   145              57              6      [?]      10       --
   218              62              9      11       14       19
   274              67             12      21       26       29

*Liminal sound pressure level measured 25 cm in front of the mouth; db re 0.0002 dyne/cm$^2$.
†Sound pressure level relative to liminal sound pressure level.


reported data in the literature (1, 7,
10). Most of the data, however, have 
been obtained on subjects who could 
not phonate normally, for example, 
subjects with fistulas in the trachea. 
Probably the best measurements to 
date on a normal subject have been 
made by van den Berg (14) using 
both a direct and an indirect tech- 
nique in which catheters were in- 
serted in the glottis and esophagus. 
Van den Berg made measurements of 
subglottic pressure over an intensity 
range beginning with the lowest in- 
tensity that the subject could sustain 
(liminal intensity) and increasing in 
five db steps to the loudest level at 
which the subject could phonate. The 
pitch range covered in the measure- 
ments began with the lowest-pitched 
chest voice and ranged upward to 
falsetto. Van den Berg’s data have 
been taken, therefore, and averaged 
to obtain representative values of sub- 
glottic pressure as a function of in- 
tensity and pitch for a male voice. 
These values, over the ranges of in- 
terest, are given in Table 1. 

In the earlier discussion, it was 
assumed that subglottic pressure re- 
mains essentially constant within the 
duration of an appreciable number of 
vocal periods. Since the trachea and 
vocal tract are not infinitely large 


reservoirs, and since they are coupled 
acoustically, the subglottic pressure 
is subject to variation about its mean 
value. Van den Berg et al. (15) have 
made calculations of the magnitude of 
this variation and estimate it to be 
less than five per cent of the mean 
subglottic pressure. A qualitative no- 
tion of the relative magnitudes of 
mean subglottic pressure and acoustic 
sound pressure in the tract can be 
had by realizing that the peak acoustic 
pressure at the mouth of a speaker 
producing a very loud sound usually 
is less than about $10^2$ dynes/cm$^2$,
while a typical mean subglottic pressure
is of the order of $10^4$ dynes/cm$^2$.
For first-order approximation, there- 
fore, the assumption of constant sub- 
glottic pressure during the production 
of voiced sounds seems reasonably 
valid and realistic, and this assump- 
tion will be made in using Figure 4 
and Table 1 to relate area and volume 
flow. 


Glottal Area and Volume Flow 
During Phonation 


Several years ago a technique was 
developed at the Bell Laboratories for 
making high-speed motion pictures 
of the vibrations of the vocal folds 
(4). This technique has since been 












































































































Figure 5. Typical cycles of the glottal area
and volume velocity functions for male
subject A phonating under four different
conditions of pitch ($F_0$) and intensity (see
Table 2). (Panels: A-I, $F_0$ = 125 cps, $P_s$ = 4 cm H2O; A-II, 125 cps, 8 cm H2O;
A-III, 250 cps, 10 cm H2O; A-IV, 235 cps, 20 cm H2O. Abscissa: time in milliseconds.)

Figure 6. Typical cycles of the glottal area
and volume velocity functions for male subject
B phonating under four different conditions
of pitch ($F_0$) and intensity (see
Table 2). (Panels: B-I, $F_0$ = 111 cps, $P_s$ = 4 cm H2O; B-II, 111 cps, 7 cm H2O;
B-III, 222 cps, 9 cm H2O; B-IV, 250 cps, 24 cm H2O. Abscissa: time in milliseconds.)







Table 2. Pitch and intensity data, adapted from Fletcher (6), for two adult males phonating the
vowel [æ] under four different conditions: (I) at the lowest pitch and intensity possible, consistent
with sustaining the vowel sound; (II) at the same pitch as (I), but with the intensity increased to
the highest level possible without altering the vowel; (III) at the lowest intensity possible but with
the pitch approximately one octave above (I); and (IV) at the same pitch as (III) and at the highest
intensity possible without altering the vowel.

Subject          Pitch (cps)            Intensity Difference (db)
             I     II    III    IV         II-I      IV-III
   A        125   125    250   235           7         16
   B        111   111    222   250           5         17








used by a number of investigators, 
and a substantial amount of data is 
available in the literature on the area 
of the glottal port during voicing. 
Among these works is a comprehen- 
sive study of internal laryngeal ac- 
tivity by W. W. Fletcher (6). Be- 
cause of his measurements of sound 
pressure levels concomitant with the 
photography, his glottal area data 
have been used to deduce waveforms 
of volume velocity. 

Fletcher took high-speed pictures 
(4000 frames/sec) for three normal 
subjects: two male and one female. 
The subjects were instructed to produce
the vowel [æ] under four different
conditions: (I) at the lowest
pitch and intensity possible, consistent 
with sustaining the vowel sound; (II) 
at the same pitch as (I), but with the 
intensity increased to the highest level 
possible without altering the vowel; 


Table 3. Estimated subglottic pressures for
the four conditions (I, II, III, and IV as in
Table 2) under which male subjects A and B
phonated.

Subject    Subglottic Pressure (cm H2O)
             I     II    III    IV
   A         4      8     10    20
   B         4      7      9    24








(III) at the lowest intensity possible
but with the pitch approximately one
octave above (I); and (IV) at the
same pitch as (III) and at the highest
intensity possible without altering the
vowel. Measurements were made of
the intensity differences between conditions
(I) and (II) and between (III)
and (IV). The values of pitch and 
intensity difference for Fletcher’s 
male subjects (A and B) are shown 
in Table 2. To determine continuous 
functions representing typical cycles 
of the glottal area for these con- 
ditions, smooth curves with continuity 
of slope were drawn through the 
data points.¹


Using the data in Table 1, values 
were estimated for the subglottic 
pressures corresponding to the con- 
ditions under which Fletcher’s male 
subjects phonated. These estimated 
values are recorded in Table 3. 


With the pressure data of Table 3, 


¹Fletcher's data constitute samples of glottal
area obtained at a rate of 4000 times/
sec. For the present analysis continuous area 
functions are desired. The argument for 
drawing smooth curves with continuity of 
slope through the area samples is as follows: 
the vocal folds are massive elements; the 
laryngeal muscles can exert only a finite 
force; therefore, the acceleration of the 
folds must be finite, and their velocity and 
displacement must be continuous functions 
of time. 







Table 4. Effective sampling rates for harmonic
analysis of the glottal area and volume velocity
functions for the four conditions (I, II, III, and
IV as in Table 2) under which male subjects
A and B phonated.

Subject          Samples per second
              I       II      III      IV
   A        9000    9000    18000    16900
   B        8000    8000    16000    18000








the glottal area functions from 
Fletcher, and the relation plotted in 
Figure 4, corresponding cycles of 
glottal volume flow were determined 
for subjects A and B (male). The 
plotted area and velocity functions 
for these two subjects are shown 
in Figure 5 (a-d) and Figure 6 (a-d). 
The area functions for the female 
subject also were plotted but are not 
reproduced here. The volume flow 
functions for this subject were not 
determined because data on subglottic 
pressure were not available for the 
female. 


It is apparent from these plots that 
the leading and trailing edges of the 
velocity function are slightly more 
steep than those of the area function. 
As already pointed out, the volume 
velocity is approximately propor- 
tional to the third power of the area 
for small values of glottal area, and 


is more nearly proportional to the
first power of the area for large glottal
openings. It also may be noted
that in most of the low intensity cases
complete closure of the glottis is not
attained, while in the higher intensity
cases complete closure usually is attained.

An IBM 650 digital computer was
programmed to make a Fourier series 
analysis of each of the area and ve- 
locity functions. In the analysis each 
function was sampled at every five 
degrees of the argument, resulting in 
effective sampling rates for the vari- 
ous conditions as shown in Table 4. 
The amplitudes and phases of 35 har- 
monic components were computed 
for each function, and the calculated 
series reproduced the original func- 
tions within the maximum errors 
shown in Table 5. The harmonic am- 
plitude spectrum for each function 
was computed in terms of db relative 
to the amplitude of the fundamental 
component. The computed amplitude 
spectra are plotted in Figure 7 (a-d) 
and Figure 8 (a-d). 
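The analysis step described above can be sketched as follows. This is a minimal illustration, not the original IBM 650 program: the function names are hypothetical and the test cycle is invented rather than taken from Fletcher's data. Sampling every five degrees gives 72 samples per period, so harmonic 36 is the highest that such sampling can represent, which is consistent with the 35 harmonics reported; at a fundamental of 125 cps the same spacing corresponds to 9000 samples per second, as in Table 4.

```python
import cmath
import math

SAMPLES_PER_CYCLE = 360 // 5  # one sample every five degrees of the argument


def effective_sampling_rate(f0_cps):
    """Samples per second when each period of f0 is sampled 72 times."""
    return SAMPLES_PER_CYCLE * f0_cps


def harmonic_amplitudes(cycle, n_harmonics=35):
    """Amplitudes of harmonics 1..n_harmonics from one uniformly sampled cycle."""
    n = len(cycle)
    amplitudes = []
    for k in range(1, n_harmonics + 1):
        # Discrete Fourier sum for harmonic k over exactly one period.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(cycle))
        amplitudes.append(2.0 * abs(coeff) / n)
    return amplitudes


def db_re_fundamental(amplitudes, floor=1e-12):
    """Harmonic levels in db relative to the fundamental component."""
    reference = amplitudes[0]
    # The floor only guards against log10(0) for vanishing harmonics.
    return [20.0 * math.log10(max(a, floor) / reference) for a in amplitudes]


# Hypothetical test cycle (not Fletcher's data): a mean value plus two harmonics.
cycle = [1.0
         + math.cos(2 * math.pi * i / SAMPLES_PER_CYCLE)
         + 0.5 * math.cos(4 * math.pi * i / SAMPLES_PER_CYCLE)
         for i in range(SAMPLES_PER_CYCLE)]
levels = db_re_fundamental(harmonic_amplitudes(cycle))
print(effective_sampling_rate(125), round(levels[1], 2))
```

For a measured cycle, the list `cycle` would hold the 72 values read from the smoothed area or velocity curve.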
In most of the cases, the spectral
graphs show that in terms of db the
spectra for the area and volume velocity
functions do not differ greatly.
The ‘sharpening’ effect that tends to
make the higher harmonics of the velocity
function more intense than


Table 5. Maximum errors in the computed series for the area and velocity functions. The four conditions (I, II, III and IV as in Table 2) are those under which male subjects A and B phonated.

Subject     I       II      III     IV

Area Function (mm²)
A           0.007   0.008   0.011   0.003
B           0.036   0.004   0.010   0.0001

Volume Velocity Function (cm³/sec)
A           0.18    0.36    0.21    1.27
B           0.23    0.33    0.41    0.08








































[Figure 7, first two panels (plots not reproduced). Each panel shows level in db, from 0 to -60, against frequency in cycles per second, from 100 to 10,000, with separate symbols for the area and velocity functions. Legible panel titles: Subject A-I, F0 = 125 cps, Ps = 4 cm H2O; Subject A (panel numeral illegible), F0 = 250 cps, Ps = 10 cm H2O.]


FLANAGAN: GLOTTAL SOUND SOURCE _ 109 





























[Figure 7, remaining panels (plots not reproduced). Axes as in the other panels: level in db, from 0 to -60, against frequency in cycles per second, from 100 to 10,000. Legible panel title: Subject A (panel numeral illegible), F0 = 125 cps, Ps = 8 cm H2O. The last panel has no legible title.]

Figure 7. Harmonic amplitude spectra for the glottal area and volume velocity functions shown in Figure 5 (Subject A).


those of the area function is, for the 
most part, only slightly discernible. 
This is particularly true for the low 
intensity, low pitch cases. It is less 
true for the high intensity, high pitch 
cases where the disparity between 
the area and velocity functions is, in 
general, more noticeable. Considering 
the effectiveness of low frequency 
components in masking higher ones, 
and considering also the spectral dis- 


tributions of energy for vocalic 
sounds, the differences between the 
area and velocity spectra probably 
are not greatly significant in terms 
of perception. To the extent that 
prior assumptions are valid, and bar- 
ring the cases of extremely high pitch 
and intensity, the spectra indicate (as 
have already the time functions in 
Figures 5 and 6) that the area function 





[Figure 8, four panels (plots not reproduced). Each panel shows level in db, from 0 to -60, against frequency in cycles per second, from 100 to 10,000, with separate symbols for the area and velocity functions. Legible panel titles: Subject B-I, Ps = 4 cm H2O (F0 only partly legible); Subject B (panel numeral illegible), F0 = 222 cps, Ps = 9 cm H2O; Subject B (panel numeral and F0 illegible), Ps = 7 cm H2O; Subject B-IV, F0 = 250 cps, Ps = 24 cm H2O.]

Figure 8. Harmonic amplitude spectra for the glottal area and volume velocity functions shown in Figure 6 (Subject B).


is a fairly reasonable representation 
of the volume flow. 

If a straight line were fitted to the 
spectral envelopes, the best fit would 
have a slope of the order of -10 to -12 
db/octave. To a rough approxima- 
tion, therefore, the amplitudes of the 
spectral components vary inversely 
with frequency, or harmonic number, 
raised to the power 1.7 to 2.0. (This 
relation also was found to apply to 
the area data for the female speaker.) 
In most of the low pitch cases a 


straight line fits the spectral envelopes 
relatively well. In the high pitch cases, 
however, the fit to a straight line is 
fairly good at the low frequency end 
of the spectrum, but the high fre- 
quency end is, in general, more in- 
tense than is indicated by a simple -10 
to -12 db/octave line. 
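The correspondence between a spectral slope in db/octave and the power of frequency quoted above follows from the fact that an amplitude varying as 1/f^n falls by 20 n log10(2), or about 6.02 n db, per octave. The short sketch below works this out; the function names are illustrative only.

```python
import math

# An amplitude proportional to 1/f falls by this many db per octave (about 6.02).
DB_PER_OCTAVE_PER_UNIT_POWER = 20.0 * math.log10(2.0)


def power_from_slope(slope_db_per_octave):
    """Exponent n in amplitude ~ 1/f**n for a given (negative) slope."""
    return -slope_db_per_octave / DB_PER_OCTAVE_PER_UNIT_POWER


def envelope_db(harmonic_number, slope_db_per_octave):
    """Level of a harmonic, in db re the fundamental, on a straight-line envelope."""
    return slope_db_per_octave * math.log2(harmonic_number)


for slope in (-10.0, -12.0):
    print(slope, round(power_from_slope(slope), 2))
```

Slopes of -10 and -12 db/octave give powers of about 1.66 and 1.99, in agreement with the range 1.7 to 2.0 stated in the text.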

It is tempting, in the light of the 
foregoing data, to digress and specu- 
late on the physical correlates of voice 
quality. For phonation under essentially the same conditions, the two
male subjects (A and B) have ampli- 
tude spectra that are quite similar. 
Their waveforms of glottal area and 
velocity, however, differ considerably 
in shape. It may be expected, conse- 
quently, that the phase spectra of 
their glottal functions also differ. Ex- 
amination of the computed phase 
spectra (not reproduced here) shows 
this to be the case. Despite the way 
in which Helmholtz has been re- 
peatedly interpreted (or rather mis- 
interpreted), it is clear that man is 
not completely ‘phase-deaf.’ It seems 
conceivable, therefore, that one im- 
portant correlate of voice quality 
might be the phase spectrum of the 
glottal source. 


The implication is not that glottal 
‘phase’ is the entire story; most as- 
suredly other factors contribute to 
voice quality. Characteristic patterns 
of pitch inflection and stress can serve 
to label a given voice, as can ‘residual’ 
nasal coupling and characteristic
damping of vocal resonances. Never- 
theless, the phase spectrum of the 
glottal source could play a significant 
role. Considerable psychoacoustic ex- 
perimentation appears to be needed 
before a satisfactory quantification of 
voice quality can be formulated. 


Differential Acoustic Resistance 
and Equivalent Circuit of 
the Glottis 


In the synthesis of speech by analog 
techniques it usually is more desirable 
to make ‘signal’ equivalents of the 
vocal mechanism rather than attempt 
to simulate total quantities, including 
quiescent terms. Since the glottis is, 
in effect, the source of signal energy 
for voiced sounds, it is important to 




know how this source should be rep- 
resented in an analog. 

A ‘small signal’ equivalent circuit 
for the glottal source can be deter- 
mined in essentially the same manner 
that the equivalent circuit for a vacu- 
um tube amplifier is derived. Let 
the instantaneous volume velocity 
through the glottis, Q(t), be written 
as a function of the pressure differential
across the glottis, P(t) (i.e.,
the pressure under the glottis with
respect to the pressure above the glottis,
or the subglottic pressure), and
the glottal area, A(t), so that

$$Q(t) = f[P(t), A(t)]. \tag{15}$$


Each of the time functions can be 
considered as having the form of a 
mean value with a time-varying com- 
ponent superimposed, as 

$$X(t) = X_0 + \Delta X = X_0 + x(t), \tag{16}$$


where X (t) > 0. 


Dropping the explicit notation for
time and expanding (15) in a Taylor
series about the quiescent point
(P₀, A₀) yields

$$Q(P,A) = Q(P_0,A_0) + \left.\frac{\partial Q}{\partial P}\right|_{P_0,A_0}(P-P_0) + \left.\frac{\partial Q}{\partial A}\right|_{P_0,A_0}(A-A_0) + \cdots = Q_0 + \Delta Q = Q_0 + q(t). \tag{17}$$





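Taking only the first-order terms
gives for the small signal case

$$\Delta Q = \left.\frac{\partial Q}{\partial P}\right|_{P_0,A_0}\Delta P + \left.\frac{\partial Q}{\partial A}\right|_{P_0,A_0}\Delta A,$$

or

$$q(t) = \left.\frac{\partial Q}{\partial P}\right|_{P_0,A_0} p(t) + \left.\frac{\partial Q}{\partial A}\right|_{P_0,A_0} a(t). \tag{18}$$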








Equations (18) can be interpreted as 
the terminal relations for a constant 
volume velocity generator with a 
non-infinite internal impedance (anal- 
ogous to the equivalent circuit for a 
pentode vacuum tube). Such a representation is shown in Figure 9.

[Figure 9 diagram not reproduced. Legible labels: the output volume velocity q(t) and a generator term proportional to (∂Q/∂A) a(t).]

Figure 9. Small-signal equivalent circuit for the glottal sound source.

Since the quiescent point (P₀, A₀)
for typical operation of the glottis
usually places the mean flow in a
region where the kinetic resistance
term is the predominating factor, the
simple relation (7), i.e., Q = δA(2P/ρ)^{1/2},
might be used to estimate
the differential quantities. Carrying
out the indicated differentiation yields

$$\left.\frac{\partial Q}{\partial P}\right|_{P_0,A_0} = \frac{(\delta A_0)^2}{\rho Q_0}, \quad\text{and}\quad \left.\frac{\partial Q}{\partial A}\right|_{P_0,A_0} = \frac{Q_0}{A_0}. \tag{19}$$





Relation (19) shows that the resistance
of the glottal orifice to small
signal variations in volume flow is approximately
ρQ₀/(δA₀)², and is about
twice the value of the resistance to
steady flow. In addition, the equivalent
volume velocity generator is one
which generates an acoustic volume
velocity equal to the a.c. component
of the area function, a(t), multiplied
by the ratio of mean volume flow to
mean glottal area, Q₀/A₀. Using some
typical values for Q₀ and A₀, a computation
can be made of the incremental
resistance of the glottis. Taking
Q₀ ≈ 150 cm³/sec, A₀ ≈ 6 mm²
and δ ≈ 0.8 yields

ρQ₀/(δA₀)² ≈ 73 cgs acoustic ohms.
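The figure of 73 cgs acoustic ohms, and the statement that the differential resistance is about twice the steady-flow resistance, can be checked with a short calculation. The sketch below assumes the kinetic-resistance relation Q = δA(2P/ρ)^{1/2} with the density of moist air given in the paper's footnote; the function names are hypothetical, and the finite-difference step is an arbitrary choice made only to confirm the analytic derivative.

```python
import math

RHO = 1.12e-3  # density of moist air at body temperature, gm/cm^3 (footnote value)
DELTA = 0.8    # orifice coefficient used in the text


def flow(pressure, area):
    """Volume velocity (cm^3/sec) for pressure in dynes/cm^2 and area in cm^2."""
    return DELTA * area * math.sqrt(2.0 * pressure / RHO)


def differential_resistance(q0, a0):
    """Small-signal resistance rho*Q0/(delta*A0)^2, in cgs acoustic ohms."""
    return RHO * q0 / (DELTA * a0) ** 2


def steady_resistance(q0, a0):
    """Steady-flow resistance P0/Q0 implied by the same relation."""
    p0 = 0.5 * RHO * (q0 / (DELTA * a0)) ** 2
    return p0 / q0


Q0 = 150.0  # cm^3/sec
A0 = 0.06   # 6 mm^2 expressed in cm^2
P0 = 0.5 * RHO * (Q0 / (DELTA * A0)) ** 2  # quiescent pressure, dynes/cm^2

# Central-difference estimate of dQ/dP at the quiescent point.
step = 1.0
dq_dp = (flow(P0 + step, A0) - flow(P0 - step, A0)) / (2.0 * step)

print(round(differential_resistance(Q0, A0), 1))
print(round(1.0 / dq_dp, 1))
print(round(differential_resistance(Q0, A0) / steady_resistance(Q0, A0), 3))
```

With these values the implied quiescent pressure is roughly 5.5 × 10³ dynes/cm², or between 5 and 6 cm H2O.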


If the acoustic impedance presented
to the glottis by the vocal tract and
the trachea can be computed, an idea
can be obtained as to whether the
glottis functions more nearly as a signal
source of constant volume velocity
or of constant pressure. For an
unconstricted configuration, the vocal
tract might be idealized as a cylindrical
tube, open at one end, about 17
cm long, and of the order of 10 cm²
in cross-section. Olson (11) gives data
on the driving point impedance of
smooth, hard-walled tubes, open at
the far end. Except in the vicinity of
the first normal mode, the impedance
looking into such a tube is less than
about 3ρc/A, where A is the cross-sectional
area of the tube, and c the
velocity of sound.² For frequencies
near the first normal mode (approximately
500 cps for the unconstricted
vocal tract) the impedance may be
as high as 18ρc/A for the hard-walled
tube.

If the vocal tract were an unconstricted
lossless tube, the driving point
impedance would be equal to, or less
than, 10 to 15 cgs acoustic ohms at
frequencies removed from the first
natural frequency, and would be of
the order of 70 cgs acoustic ohms at
the first natural frequency. The soft-walled
vocal tract, however, does not
exhibit resonances as sharp as a hard-walled
tube, so that these values might





²For moist air at body temperature, c = 3.56 × 10⁴ cm/sec and ρ = 1.12 × 10⁻³ gm/cm³.













be considered approximate upper 
limits to the driving point impedance. 
The impedance of the glottis, there- 
fore, appears high compared with the 
driving point impedance of the tract 
except for frequencies in the vicinity 
of the first formant; here the tract 
impedance may be comparable to the 
impedance of the glottis. 
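The tract impedance levels quoted above can be reproduced from the footnote constants. The sketch below is a rough check only: the multipliers 3 and 18 are the bounds the text cites from Olson for a hard-walled tube, the quarter-wavelength formula for the first mode of a tube open at one end is a standard idealization, and the variable names are hypothetical.

```python
RHO = 1.12e-3       # gm/cm^3, moist air at body temperature (footnote value)
C = 3.56e4          # cm/sec, velocity of sound (footnote value)
TRACT_AREA = 10.0   # cm^2
TRACT_LENGTH = 17.0 # cm
GLOTTAL_RESISTANCE = 73.0  # cgs acoustic ohms, from the preceding computation


def characteristic_impedance(area_cm2):
    """rho*c/A in cgs acoustic ohms."""
    return RHO * C / area_cm2


z0 = characteristic_impedance(TRACT_AREA)
off_resonance_bound = 3.0 * z0   # away from the first normal mode
near_resonance_peak = 18.0 * z0  # near the first normal mode, hard-walled tube
first_mode_cps = C / (4.0 * TRACT_LENGTH)  # quarter-wavelength resonance

print(round(off_resonance_bound, 1), round(near_resonance_peak, 1))
print(round(first_mode_cps))
print(round(GLOTTAL_RESISTANCE / off_resonance_bound, 1))
```

On these assumptions the glottal resistance is about six times the off-resonance bound and comparable to the peak value near the first mode, which is the comparison drawn in the text.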

It is possible, in a similar manner, 
to get a rough idea of the acoustic 
impedance looking into the trachea. 
Quantitative data on dimensions of 
the lower respiratory tract, however, 
are very sketchy in the literature. 
This is a result, in part, of the fact 
that the lungs are subject to great 
variation in capacity, and the trachea 
and bronchi distend appreciably dur- 
ing inspiration. The branches of the 
bronchial ‘tree’ terminate in a very 
complex configuration of air sacs, 
composed of spongy, porous tissue. 
Examination of physiology texts leads 
to the impression that the specific 
acoustic impedance looking into the 
trachea, with the lungs moderately 
inflated, probably is of the order of 
ρc (i.e., free space), at least for frequencies
above several hundred cps.
Certainly the lungs represent a highly
dissipative and absorptive termination
when moderately inflated. If the lungs
were assumed to be a ρc termination,
then the acoustic impedance looking
into the trachea would be simply ρc/Aₜ,
where Aₜ is the cross-sectional
area of the trachea. For an adult male
Aₜ is of the order of 300 mm², so that
ρc/Aₜ ≈ 13 cgs acoustic ohms. This
value is relatively small compared with
the differential acoustic resistance calculated
for the glottis.

On the low frequency side, ap- 
preciable radiation of low frequency 
sound can be produced from the 
chest by uttering very low-pitched 
sounds in the chest register. If, in such 







a case, the system below the glottis 
were acting as a simple Helmholtz 
resonator, the tracheal air column 
would be moving essentially in phase 
and resonating with the compliance of 
the lung cavities. If typical dimensions 
are assumed for the subglottic system, 
a computation can be made of the 
resonant frequency of the simple inertance-compliance
elements. Taking
15 cm and 300 mm² as the approximate
length and cross-sectional area,
respectively, of the trachea-bronchi
equivalent tube, the inertance of the
tracheal air mass is Lₐ ≈ 5.6 × 10⁻³
cgs units. Similarly, taking three liters
as a typical volume of the lung space,
the compliance of the lung cavities
is Cₐ ≈ 2.1 × 10⁻³ cgs units. The resonant
frequency is, therefore, f₀ =
1/[2π(LₐCₐ)^{1/2}] ≈ 46 cps. In this simple
idealization the acoustic reactance
looking into the neck of the resonator
(i.e., the trachea) is approximately
+j3 cgs acoustic ohms at 100 cps, and
about +j18 cgs acoustic ohms at
500 cps. (The simple resonator ap- 
proximation does not hold well for 
frequencies greater than about 500 
cps since the dimensions of the sys- 
tem become comparable to a wave- 
length.) Again the indication is that 
the impedance looking into the 
trachea is fairly small compared with 
the acoustic impedance of the glottis. 
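The subglottal estimates can be retraced numerically. The sketch below assumes the usual lumped expressions, inertance ρl/A for the tracheal air column and compliance V/(ρc²) for the lung volume, together with the footnote constants; these formulas are standard but are not written out in the text, and the names are illustrative.

```python
import math

RHO = 1.12e-3  # gm/cm^3 (footnote value)
C = 3.56e4     # cm/sec (footnote value)

TRACHEA_LENGTH = 15.0  # cm
TRACHEA_AREA = 3.0     # 300 mm^2 expressed in cm^2
LUNG_VOLUME = 3000.0   # three liters expressed in cm^3

inertance = RHO * TRACHEA_LENGTH / TRACHEA_AREA  # about 5.6e-3 cgs units
compliance = LUNG_VOLUME / (RHO * C ** 2)        # about 2.1e-3 cgs units
resonance_cps = 1.0 / (2.0 * math.pi * math.sqrt(inertance * compliance))


def reactance(freq_cps):
    """Series reactance of the inertance-compliance pair, cgs acoustic ohms."""
    omega = 2.0 * math.pi * freq_cps
    return omega * inertance - 1.0 / (omega * compliance)


# Trachea terminated in rho*c, as assumed for frequencies above a few hundred cps.
tracheal_rho_c = RHO * C / TRACHEA_AREA

print(round(resonance_cps), round(reactance(100.0), 1), round(reactance(500.0), 1))
print(round(tracheal_rho_c, 1))
```

Both the reactance values and the ρc/Aₜ value remain well below the differential resistance of about 73 cgs acoustic ohms computed for the glottis.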

Since the impedance level of the 
subglottal system is relatively low, 
the equivalent acoustic impedance 
presented by the glottis, when viewed 
from the vocal tract side, is nearly 
the impedance of the glottis. The application
of this result to the excitation
of an electrical analog is apparent.
A current source, having an
internal impedance analogous to the 
differential resistance of the glottis, 
and generating a current waveform 
proportional to the a.c. component 


Figure 10. Idealized waveform of glottal volume velocity, f(t), and its spectral envelope, F(ω).


of the glottal area function, is not, 
therefore, an unrealistic source of 
excitation. 


If, in the electrical analog, it is not 
possible to have a source that gen- 
erates the a.c. component of the area 
(or velocity) waveform, the next best 
thing is to have a periodic source 
whose amplitude spectrum (but not 
necessarily waveform) is similar to 
that of the glottal source. This is 
equivalent to disregarding the phase 
spectrum of the glottal source, and is 
essentially the way in which most of 
the existing analogs are excited. Al- 
though phase does not appear to in- 
fluence speech intelligibility greatly, 
it might be expected, nevertheless, 
that the quality of speech synthesized
from the latter source might not be 
quite as natural as that produced by 
the former. 


‘Efficiency’ of the Glottal Source 


In his studies of internal laryngeal 
activity W. W. Fletcher (6) found 
no consistent relationship between 
amplitude of fold displacement and 
vocal intensity. He did find, however, 


that the element of vibratory motion 
most consistently associated with in- 
tensity of voice was the closed phase 
of the cycle of cord vibration. Also, 
in the lecture notes that accompany 
the Bell Laboratories film, ‘High-Speed
Motion Pictures of the Human
Vocal Cords,’ the observation is made
that the closure time per cycle is 
greater for trained voices than for 
untrained voices. Further, the state- 
ment is made: ‘ .. . it is this ability 
to better control the flow of air that 
enables the trained voice to radiate 
a greater amount of sound power than 
the untrained voice, or to radiate an 
equal amount of power with a lower 
air flow.’ 

A fairly simple computation will 
illustrate how such relations can arise. 
Let the waveform of glottal volume 
velocity be idealized as a triangular 
wave, and suppose that the fundamen- 
tal frequency and mean volume flow 
are maintained constant while closure 
time per cycle is varied. The volume 
velocity function may be represented, 
therefore, as 


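$$f(t) = \begin{cases} \dfrac{1}{a}\left(1 - \dfrac{|t|}{a}\right), & 0 \le |t| \le a \\ 0, & a \le |t| \le T/2, \end{cases} \tag{20}$$

where T is the fundamental period,
2a is the time that the wave is in the
open phase, and 2a ≤ T and $\overline{f(t)} = 1/T$.
The envelope of the amplitude
spectrum for this function is the
Fourier transform of one cycle, and
is

$$F(\omega) = \left[\frac{\sin(\omega a/2)}{\omega a/2}\right]^{2}. \tag{21}$$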

The function and its amplitude spec- 
trum are illustrated for an arbitrary 
value of a in Figure 10. 





[Figure 11 plot not reproduced. The horizontal axis is the duty cycle φ, from 0 to 1.00.]

Figure 11. Relation between duty cycle of the glottis, φ (i.e., ratio of open time to total period), and the rms value of the a.c. component of the glottal volume flow.


The question under consideration
is how the relative closure time or
‘duty cycle’ of the wave is related to
the sound energy produced, that is,
to the rms value of the a.c. component
of the wave. Let the glottal duty
cycle, φ, be defined as the ratio of
open time to the total period, or φ =
2a/T. Computing the rms value of
(20) in the conventional manner
yields

$$\left[\overline{f^{2}(t)}\right]^{1/2} = \left[\frac{2}{3aT}\right]^{1/2} = \frac{2}{T}\left[\frac{1}{3\phi}\right]^{1/2}. \tag{22}$$


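In a like manner, the a.c. component
of f(t) is

$$f_{ac}(t) = f(t) - \overline{f(t)} = \begin{cases} \dfrac{1}{a}\left(1 - \dfrac{|t|}{a}\right) - \dfrac{1}{T}, & 0 \le |t| \le a \\ -\dfrac{1}{T}, & a \le |t| \le T/2, \end{cases} \tag{23}$$

and

$$\left[\overline{f_{ac}^{2}(t)}\right]^{1/2} = \left[\frac{2T - 3a}{3aT^{2}}\right]^{1/2} = \frac{1}{T}\left[\frac{4 - 3\phi}{3\phi}\right]^{1/2}. \tag{24}$$

Since $\overline{f(t)} = 1/T$, the ratio of the
rms value of the a.c. component to
the mean value of the wave is simply

$$\frac{\left[\overline{f_{ac}^{2}(t)}\right]^{1/2}}{\overline{f(t)}} = \left[\frac{4 - 3\phi}{3\phi}\right]^{1/2}, \tag{25}$$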










and a sketch of this relation is shown 
in Figure 11. It is obvious from (24) 
and (25) that if mean flow and fun- 
damental frequency are maintained 
constant, the rms value of the a.c. 
component of the glottal wave in- 
creases as duty cycle decreases, and 
the supply of lung air is converted
into sound more ‘efficiently.’ It also 
is apparent from (21) and from Fig- 
ure 10 that the spectrum of the glot- 
tal wave becomes ‘richer’ in higher 
harmonics as the duty cycle decreases 
(and the sound intensity increases). 
This, of course, is what has been ob- 
served experimentally from spectral 
measurements on sounds produced at 
different vocal intensities. 
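The dependence of the a.c. content on duty cycle can be checked numerically. The sketch below evaluates the closed form [(4 - 3φ)/(3φ)]^{1/2} of relation (25), compares it with a direct midpoint-rule integration of the triangular pulse of (20) for a period normalized to T = 1, and evaluates the envelope of (21) at one frequency for two open times. The function names, the normalization, and the sample values are illustrative assumptions.

```python
import math


def rms_ac_over_mean(phi):
    """Closed form of relation (25) for duty cycle phi, 0 < phi <= 1."""
    return math.sqrt((4.0 - 3.0 * phi) / (3.0 * phi))


def rms_ac_over_mean_numeric(phi, n=20000):
    """Midpoint-rule check using the triangular pulse of (20) with T = 1."""
    period = 1.0
    a = phi * period / 2.0
    dt = period / n
    mean = 0.0
    mean_square = 0.0
    for i in range(n):
        t = -period / 2.0 + (i + 0.5) * dt
        f = (1.0 / a) * (1.0 - abs(t) / a) if abs(t) <= a else 0.0
        mean += f * dt / period
        mean_square += f * f * dt / period
    # The a.c. mean square is the total mean square minus the squared mean.
    return math.sqrt(mean_square - mean ** 2) / mean


def spectral_envelope(omega, a):
    """Envelope of relation (21): [sin(omega*a/2) / (omega*a/2)]^2."""
    x = omega * a / 2.0
    return 1.0 if x == 0.0 else (math.sin(x) / x) ** 2


for phi in (1.0, 0.5, 0.25):
    print(phi, round(rms_ac_over_mean(phi), 3))
```

Halving the duty cycle from 0.5 to 0.25 raises the ratio from about 1.29 to about 2.08 at constant mean flow, and a shorter open time gives a larger envelope value at a fixed high frequency, in line with the remarks on efficiency and on richer higher harmonics.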


Summary 


Dynamic relations between the area 
of the human glottis and the volume 
flow of air through the glottis dur- 
ing phonation have been discussed. 
Data reported previously on glottal 
area and subglottic pressure have 
been used to deduce waveforms of 
glottal volume flow. Amplitude spec- 
tra for the area and volume velocity 
functions have been computed on an 
IBM 650 computer, and the results 
plotted. A small-signal equivalent cir- 
cuit for the glottal sound source has 
been determined. The differential 
acoustic resistance of the glottis has 
been calculated, and compared with 
the impedance levels of the vocal 
tract and subglottic system. Applica- 
tion of the results to the synthesis of 
speech by analog techniques has been 
indicated. 


References 


1. Curry, R. O. L., The Mechanism of the Human Voice. New York-Toronto: Longmans, Green and Co., 1940.

2. Dunn, H. K., The calculation of vowel resonances and an electrical vocal tract. J. acoust. Soc. Amer., 22, 1950, 740-753.

3. Fant, C. G. M., Speech communication research. Ingen. Vetensk. Akad. (Stockholm), 24, 1953, 331-337.

4. Farnsworth, D. W., High-speed motion pictures of the human vocal cords. Bell Labs. Rec., 18, 1940, 203-208.

5. Flanagan, J. L., and House, A. S., Development and testing of a formant-coding speech compression system. J. acoust. Soc. Amer., 28, 1956, 1099-1106.

6. Fletcher, W. W., A study of internal laryngeal activity in relation to vocal intensity. Ph.D. Thesis, Northwestern University, 1950.

7. Gutzmann, H., Physiologie der Stimme und Sprache. Braunschweig: Vieweg und Sohn, 1909.

8. House, A. S., Analog studies of nasal consonants. JSHD, 22, 1957, 190-204.

9. House, A. S., and Stevens, K. N., Analog studies of the nasalization of vowels. JSHD, 21, 1955, 218-232.

10. Judson, L. S., and Weaver, A. T., Voice Science. New York: F. S. Crofts and Co., 1942.

11. Olson, H. F., Elements of Acoustical Engineering, Second Edition. New York: D. van Nostrand Co., 1947.

12. Stevens, K. N., and House, A. S., Development of a quantitative description of vowel articulation. J. acoust. Soc. Amer., 27, 1955, 484-493.

13. Stevens, K. N., Kasowski, S., and Fant, C. G. M., An electrical analog of the vocal tract. J. acoust. Soc. Amer., 25, 1953, 734-742.

14. van den Berg, J. W., Direct and indirect determination of the mean subglottic pressure. Folia Phoniatr., 8, 1956, 1-24.

15. van den Berg, J. W., Zantema, J. T., and Doornenbal, P., Jr., On the air resistance and the Bernoulli effect of the human larynx. J. acoust. Soc. Amer., 29, 1957, 626-631.

16. Wegel, R. L., Theory of vibration of the larynx. Bell Syst. tech. J., 9, 1930, 207-227.

17. Westervelt, P. J., and McAuliffe, C. E., Differential resistance and reactance of sharp-edged circular orifices. Quart. Rep., Acoustics Lab., Mass. Inst. Tech., 1950, July-September, 15-16.







Frequency-Intensity Relationships 
And Optimum Pitch Level 


Wayne L. Thurman 


Fundamental to many procedures em- 
ployed in vocal retraining is the belief 
that for each person there is an ‘opti- 
mum’ or ‘natural’ pitch level at which 
the human vocal apparatus operates 
with the greatest efficiency (1, 3, 5, 6, 
8). Various clinical techniques have 
been described for locating the opti- 
mum pitch level for the speaking 
voice. Wentworth (7) found eight 
different methods suggested in 14 
texts. Pronovost (4) described and 
experimented with nine. 

Prominent among clinical tech- 
niques for locating optimum pitch 
level is the procedure of having the 
subject hum or sing scales to deter- 
mine the level or closely grouped 
series of tones at which vocalization 
is produced ‘most efficiently.’ In the 
most commonly recommended pro- 
cedure the client is instructed to hum 
up and down his total pitch range and 
the clinician (6), or the client himself 
(1), listens to the performance and 
identifies the tone or series of tones 
on which an involuntary swell of in- 
tensity occurs. This is taken to be an 
indication of the region of highest 


Wayne L. Thurman (Ph.D., Purdue Uni- 
versity, 1953) is Associate Professor of 
Speech and Director of the Speech and 
Hearing Clinic, Eastern Illinois University. 
This article is based on an M.A. thesis com- 
pleted at the State University of Iowa 
under the direction of Dr. James F. Curtis. 


operating efficiency for his voice, or 
the ‘optimum’ pitch level. West (8) 
states that for male voices such in- 
tensity swells may occur at two dif- 
ferent levels. 

This technique implies the regular 
and stable occurrence of certain fun- 
damental characteristics of behavior 
on the part of both subjects and ob- 
servers. Whether or not these neces- 
sary types of behavior are, indeed, 
characteristic of the singing or hum- 
ming of scales are questions of fact 
which can be investigated experimen- 
tally. The purpose of the experiment 
herein reported was to seek answers 
to the following such questions: (1) 
Do involuntary sound-level maxima 
occur characteristically when subjects 
sing or hum up or down the musical 
scale? (2) To the extent that such 
maxima are found, do they occur 
repeatedly in a given part of a sub- 
ject’s total pitch range? (3) Are such 
maxima characteristically located in 
the part of the pitch range which is 
appropriate (in terms of other cri- 
teria) for the location of optimum 
pitch level? (4) Can observers reliably 
identify sound-level maxima (or 
swells) in the scales sung or hummed 
by subjects? 


Procedure 


Briefly summarized, the procedure 
consisted in (1) recording scales as 









sung and hummed by a group of sub- 
jects, (2) submitting these recordings 
to sound-level measurement to deter- 
mine the nature and location of sound- 
level variations, and (3) playing the 
records to groups of trained listeners 
who indicated the location of any 
observed swells in loudness. 


Because optimum pitch is related 
to speaking performance during which 
the vocal resonance cavities must take 
a variety of configurations for the 
production of vowels, scales sung on 
vowels were examined in addition to 
scales for which the subjects merely 
hummed, although the latter is the 
most commonly prescribed clinical 
procedure. To sample the range of 
such cavity configurations three vowels,
[e], [a], and [u], were employed.


Subjects. Thirty volunteer subjects 
were selected from public speaking 
and speech pathology classes, 15 male 
and 15 female. Their ages ranged from 
18 years to 42 years with a mean of 
25 years. To avoid possible bias result- 
ing from training, no subject was ac- 
cepted who had taken private singing 
instructions or who had sung scales 
for practice. No subject was accepted 
who had a voice quality defect in the 
judgment of the experimenter. 


Recordings. The recording room 
measured approximately 10 feet by 12 
feet and was heavily draped to mini- 
mize reverberation. Disc recording 
equipment of good quality was used.¹
Each subject was placed about 16 


¹A Turner, Model 211, dynamic micro-
phone and a Presto, Model 6N, disc re- 
corder were employed. Recordings were 
made on lacquer-coated aluminum, Red 
Label Audiodiscs. 




inches in front of the microphone 
with a positioning brace against his 
trunk to maintain constant distance. 
Treble and bass controls on the re- 
cording amplifier were set for flat 
frequency response and the gain con- 
trol was held constant during each 
subject’s performance. 

After the experimenter demon- 
strated the discrete step technique of 
singing a scale, each subject practiced 
a scale with the vowel [a]. He was 
then asked to locate his lowest tone 
by singing down from any middle 
tone. This was repeated and a record- 
ing was made of his singing from the 
lowest to the highest tones of his 
range including falsetto on [a]. The 
procedure was repeated with [e] and 
[u]. Finally, each subject was asked 
to hum continuously, but approxi- 
mately in musical steps, up and down 
his vocal range including his lowest 
tone and his highest tone through 
falsetto, taking a breath whenever he 
wished. Thus each subject performed 
five scales, singing up his total range 
on [a], [e] and [u], and humming 
both up and down his total range, 
each time including falsetto. 


Measurements. The phonophoto- 
graphic technique described by 
Cowan (2) was used to measure the 
frequency of each note sung or 
hummed. Since discrete notes were 
sung on the three vowel scales, identi- 
fication of the notes was accomplished 
easily. On the hummed scales, silent 
periods, when the subjects stopped


for breath, were used as reference 
points to assist in identification of 
each note. An average was taken of 
10 fundamental periods from the 
middle of each note and converted to 
frequency in cycles per second. Continuous
graphic sound-level records
were made from the phonograph
discs.² Each note on the vowel scales
was easily identified on the sound- 
level records and monitoring with 
headphones made it possible for the 
experimenter to mark the notes on the 
graphic level records of the hummed 
scales. 


Calibration of Phonographic Re- 
cording and Play-back System. A con- 
stant voltage audio-oscillator signal 
was introduced at the microphone 
input of the recording amplifier and 
recorded on the same type disc used 
for the experimental recordings. 
Seventeen discrete frequencies spaced 
from 50 to 1500 cycles per second 
were recorded. This calibration rec- 
ord was played back into the graphic 
level recording system, as described 
above, using the same amplifier set- 
tings as for the experimental record- 
ings. This procedure yielded a curve 
showing the over-all frequency re- 
sponse characteristics of the phono- 
graphic recording and play-back
equipment, exclusive of the micro- 
phone. This curve was then corrected 
for the frequency response of the 
microphone by means of a calibration 
curve furnished by the manufacturer. 
This final calibration curve was used 
to correct graphic level readings for 
all the scales sung by the subjects. 
From the data thus obtained a graph 
of sound level versus frequency was 
plotted for each of the 150 scales. 
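The correction step described above amounts to subtracting, at each note's frequency, the combined response of the recording and play-back chain and of the microphone from the graphic level reading. The sketch below illustrates the idea only: the calibration points are invented placeholders rather than the measured curves, linear interpolation between calibration frequencies is an assumption, and the function names are hypothetical.

```python
from bisect import bisect_right


def interpolate(points, freq):
    """Piecewise-linear interpolation over sorted (frequency, db) pairs."""
    freqs = [f for f, _ in points]
    if freq <= freqs[0]:
        return points[0][1]
    if freq >= freqs[-1]:
        return points[-1][1]
    i = bisect_right(freqs, freq)
    (f_lo, db_lo), (f_hi, db_hi) = points[i - 1], points[i]
    return db_lo + (db_hi - db_lo) * (freq - f_lo) / (f_hi - f_lo)


def corrected_level(reading_db, freq, system_response, microphone_response):
    """Remove the combined response of the equipment from a level reading."""
    total_response = (interpolate(system_response, freq)
                      + interpolate(microphone_response, freq))
    return reading_db - total_response


# Hypothetical calibration data in db re the mid-range response (not measured values).
system_response = [(50.0, -3.0), (100.0, -1.0), (1500.0, 0.0)]
microphone_response = [(50.0, -1.0), (1500.0, 1.0)]

print(round(corrected_level(60.0, 75.0, system_response, microphone_response), 2))
```

Applying such a correction to every note of each scale yields the sound level versus frequency graph for that scale.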


Judgmental Procedure. Thirty per- 
sons with training in making critical 





²The equipment consisted of the following: a General Electric variable-reluctance pickup; an H. H. Scott, Model 210-A, amplifier; and a Sound Apparatus Company High Speed Energy Level Recorder, Model PL-E.





THURMAN: OPTIMUM PITCH LEVEL 119 


and analytical observations of voices 
(university speech clinic staff mem- 
bers, seniors, and graduate students in 
speech pathology) served as observers 
in the second part of the experiment. 
The recordings of the hummed scales 
were played with good quality equip- 
ment* to the group of listeners seated 
in a classroom. Each observer was 
provided a series of forms on which, 
for each subject (identified numeri- 
cally), a series of digits represented 
the notes hummed. It was possible for 
the observers to follow the numbers 
as they heard the ascending and de- 
scending scales from the phonograph 
discs. The highest note in each per- 
formance was underlined as a refer- 
ence point. The observers were in- 
structed to circle, on both ascending 
and descending performances, the 
number or numbers of the notes on 
which they heard swells in loudness. 
Each recording was played twice and 
as many more times as was requested 
by the observers. 


Results and Discussion 


Sound Level Versus Frequency 
Curves. In order to facilitate analysis 
and discussion, the sound-level versus 
frequency graphs were classified as 
follows: (1) single-peaked curves, 
with a single maximum of five decibels 
or more; (2) double-peaked curves, 
with two sound-level peaks exceeding 
the adjacent portions of the curve by 
five decibels or more; (3) generally 
rising curves, with an over-all increase 





*The equipment consisted of the follow- 
ing: a General Electric variable-reluctance 
pickup; an H. H. Scott, Model 210-A, am- 
plifier; and a Jensen HNP-51, extended 
range coaxial, fifteen-inch speaker in a bass 
reflex cabinet. 





of at least five decibels accompanying 
the increase in frequency; (4) gener- 
ally falling curves, with a five-decibel 
or greater drop accompanying the rise 
in frequency; (5) level curves, with 
no sound-level variations as great as 
five decibels. An example of each 
classification is presented in Figure 1. 


[Figure 1, Classes of Curves: five panels, one per class (single peak, double peak, generally rising, generally falling, level), each plotting sound level against frequency.] 

FIGURE 1. Examples of the five classes into 
which sound-level versus frequency curves 
were grouped. 


TABLE 2. Distribution showing number of scales 
sung or hummed by each subject which showed 
sound level maxima of at least five decibels 
(single-peaked or double-peaked classification). 

No. of       No. of        Percentage of 
curves       subjects      all subjects 
  0             3              10.0 
  1             8              26.7 
  2             5              16.7 
  3            11              36.7 
  4             2               6.6 
  5             1               3.3 
Totals         30             100.0 


Table 1 shows the number of 
graphs that were classified in each of 
the five groups. It may be noted that 
only 54, about 36 per cent of the total 
150 scales, showed a single maximum 
as predicted by the clinical technique 
in question. If the double-peaked 
curves, which West (8) suggested as 
typical for men, are considered, 10 
more are added, making a total of 64, 
approximately 43 per cent of all the 
scales whose general sound-level pat- 
terns showed one of the two types of 
predicted characteristics. 

The hummed scales showed more 
single-peaked variations than did any 
of the vowel scales. Thirty-one of 60 


TABLE 1. Frequency of occurrence for each of five classes of sound-level versus frequency curves. 

Scale                Single Peak   Double Peak     Rising        Falling        Level         Totals 
                      No.     %     No.     %     No.     %     No.     %     No.     %     No.     % 
[a]                     8  26.7       0   0.0       9  30.0       4  13.3       9  30.0      30   100 
[u]                     7  23.3       4  13.3      13  43.3       3  10.0       3  10.0      30   100 
[e]                     8  26.7       1   3.3       8  26.7       5  16.6       8  26.7      30   100 
Humming Upward         14  46.7       3  10.0       6  20.0       3  10.0       4  13.3      30   100 
Humming Downward       17  56.7       2   6.7       6  20.0       4  13.3       1   3.3      30   100 
Totals                 54  36.0      10   6.6      42  28.0      19  12.7      25  16.7     150   100 






hummed scales were classified in this 
group. When double-peaked scales are 
included, 36 of the 60 hummed scales 
showed one of the two types of pre- 
dicted sound-level variations. 

The vowel scales showed no general 
trend toward a particular pattern. 
Only 28 of all of the 90 vowel scales 
showed a single- or double-peaked 
pattern. In all three types of vowel 
scales—[a], [e], [u]—the generally 
rising pattern described as many, or 
more, of the performances than did 
the single-peaked pattern. 
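
The counts quoted in the preceding paragraphs can be checked directly against Table 1. The sketch below is only a tally of the tabled figures; the dictionary layout and names are assumptions made for illustration, and the falling count for [e] is taken as 5, the value the row total of 30 requires.

```python
# Counts per class transcribed from Table 1: (single, double, rising, falling, level).
table1 = {
    "[a]": (8, 0, 9, 4, 9),
    "[u]": (7, 4, 13, 3, 3),
    "[e]": (8, 1, 8, 5, 8),
    "hum up": (14, 3, 6, 3, 4),
    "hum down": (17, 2, 6, 4, 1),
}


def peaked(scales):
    """Return (single-peaked, single- or double-peaked) counts over the named scales."""
    single = sum(table1[s][0] for s in scales)
    double = sum(table1[s][1] for s in scales)
    return single, single + double


total_scales = sum(sum(row) for row in table1.values())
all_single, all_peaked = peaked(table1)
hum_single, hum_peaked = peaked(["hum up", "hum down"])
_, vowel_peaked = peaked(["[a]", "[u]", "[e]"])

print(total_scales, all_single, all_peaked)        # 150 54 64
print(round(100 * all_peaked / total_scales, 1))   # 42.7
print(hum_single, hum_peaked, vowel_peaked)        # 31 36 28
```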

Table 2 indicates the number of 
single- or double-peaked performances 
each of the subjects produced. It may 
be seen that measurements for only 
three of the 30 subjects gave five- 
decibel sound-level maxima (single or 
double peaks) on as many as four of 
their five performances. For 16 of 
the subjects, not more than two of the 
five performances showed sound-level 
peaks (single or double) as great as 
five decibels. Three of the subjects 
performed no scales in such manner 
as to give the predicted sound-level 
swells. 

Thus, the first two of the four 
questions posed in the beginning must 
be answered in the negative. The oc- 




currence of a significant peak (or 
peaks) in sound level does not seem to 
be characteristic of the scales sung and 
hummed by these subjects. Moreover, 
repeated performances by the same 
subjects did not show stable sound 
level versus frequency patterns. 


Location of Five-Decibel Peaks. 
The clinical technique requires not 
only that sound-level maxima occur, 
but that the peaks occur within that 
part of the individual’s range where 
experience and research have shown 
that his best pitch level is likely to 
be located. Ten of the 14 writers on 
the subject cited by Wentworth (7) 
hold that optimum pitch level should 
be located at or below the middle of 
a subject’s singing range. Pronovost 
(4) found that for superior males the 
median pitch levels which he assumed 
to be their best speaking levels were 
approximately 25 per cent of their 
total singing ranges above their lowest 
sustainable tones. It seems reasonable, 
further, that optimum pitch level 
should not fall below a point about 
20 per cent above the lowest note of 
the subject’s range. For the person 
with an average range, use of a median 
level lower than this 20 per cent point 


TABLE 3. Location relative to total singing range of sound-level maxima (single five decibel 
peaks plus the lower frequency maximum of double peaked curves). Percentages of singing range 
are fractions of total range including falsetto above the lowest sustainable note, measured in 
units of the equal tempered musical scale. 

                                  Number of Scales with Five Decibel Peaks 
Percentage of Singing Range       Males          Females          Totals 
                                 No.     %      No.     %       No.     % 
 0.0- 9.9                          0    0.0       1    3.0        1    1.6 
10.0-19.9                         12   38.7       2    6.1       14   21.9 
20.0-29.9                          8   25.8       3    9.1       11   17.2 
30.0-39.9                          1    3.2       6   18.2        7   10.9 
40.0-49.9                          3    9.7       4   12.1        7   10.9 
50.0-59.9                          1    3.2       7   21.2        8   12.5 
60.0-69.9                          1    3.2       8   24.2        9   14.1 
70.0-79.9                          3    9.7       2    6.1        5    7.8 
80.0-89.9                          0    0.0       0    0.0        0    0.0 
90.0-99.9                          2    6.5       0    0.0        2    3.1 
Totals                            31  100.0      33  100.0       64  100.0 













would severely restrict the extent of 
possible downward pitch variation. 
Therefore, the 20 and 50 per cent 
points on a subject’s total singing 
range were chosen as reasonable limits 
for the location of optimum pitch. 

Table 3 indicates the locations, 
within the subjects’ total singing 
ranges including falsetto, of the max- 
ima in the single-peaked curves and 
the lower of the two maxima of the 
double-peaked curves. For the male 
subjects 39 per cent of the peaks fell 
at or below the 20 per cent points on 
their total singing ranges. About 23 
per cent of such peaks fell above the 
50 per cent point. For the women, 
nine per cent of such peaks fell below 
the 20 per cent point, and about 52 
per cent fell at or above the 50 per 
cent point. In all, only 25 peaks or 39 
per cent of the maxima from all 64 
single- or double-peaked curves fell 
within the reasonable limits for an 
optimum pitch, the 20 to 50 per cent 
points of the subjects’ total singing 
ranges. 

It may be seen, therefore, that not 
only did the predicted sound-level 
maxima fail to occur consistently, but 
of those which did occur, only 39 per 
cent (17% of all 150 performances) 
were located within limits that might 
be considered reasonable for an op- 
timum pitch level. 
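
The percentages cited in this section follow from the counts in Table 3. The sketch below recomputes them; the list layout (one entry per tenth of the singing range) and the helper name are assumptions for illustration only.

```python
# Peaks per tenth of the singing range (0-9.9%, 10-19.9%, ...), transcribed from Table 3.
males = [0, 12, 8, 1, 3, 1, 1, 3, 0, 2]
females = [1, 2, 3, 6, 4, 7, 8, 2, 0, 0]


def share(counts, first_bin, last_bin):
    """Percentage of peaks falling in bins first_bin..last_bin (inclusive)."""
    return 100 * sum(counts[first_bin:last_bin + 1]) / sum(counts)


totals = [m + f for m, f in zip(males, females)]
in_limits = sum(totals[2:5])   # bins covering 20.0 to 49.9 per cent

print(round(share(males, 0, 1)), round(share(males, 5, 9)))       # 39 23
print(round(share(females, 0, 1)), round(share(females, 5, 9)))   # 9 52
print(in_limits, round(100 * in_limits / sum(totals)))            # 25 39
print(round(100 * in_limits / 150))                               # 17
```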


Judgmental Procedure. Examination 
of the percentages of observers select- 
ing a particular note as the center of 
a loudness swell showed a wide scatter 
of judgments over the entire fre- 
quency range on most of the scales. 
In only a few very exceptional cases 
was there anything approaching una- 
nimity of opinion as to the location 
of the loudness maxima. In this con- 
nection it is relevant to take note of 
the subjective reactions of the ob- 


servers to their tasks. The nearly 
unanimous reaction was that the judg- 
ments were exceedingly difficult to 
make, even after the observers heard 
the records played a number of times. 

Examination was made of cases in 
which a reasonable minimum agree- 
ment was reached among observers, 
i.e., those in which 60 per cent or 
more of the judgments were concen- 
trated on three or fewer consecutive 
notes of the scale. It was found that 
33 of the 60 hummed scales met this 
criterion. In 27 of those 33, definite 
sound-level maxima were shown on 
the graphs. Examination of the points 
thus located by 60 per cent or more 
of the observers disclosed that 16 such 
identified maxima fell below the mid- 
dle of the subject’s range while 17 fell 
above the middle. 
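
One way to picture the agreement criterion is as a sliding window over the notes of a scale. The sketch below assumes the judgments for a scale are available as a count per note; how the study pooled observers who circled more than one note is not stated, so the input format, the function name, and the example counts are all hypothetical.

```python
def meets_agreement_criterion(judgments_per_note, window=3, threshold=0.60):
    """True if some run of `window` or fewer consecutive notes holds at least
    `threshold` of all judgments. judgments_per_note[i] is the number of
    judgments that placed a loudness swell on note i of the scale."""
    total = sum(judgments_per_note)
    if total == 0:
        return False
    # A run of exactly `window` notes contains every shorter run inside it,
    # so checking full-width windows (clipped at the scale ends) is enough.
    n = len(judgments_per_note)
    for start in range(max(1, n - window + 1)):
        if sum(judgments_per_note[start:start + window]) >= threshold * total:
            return True
    return False


# Hypothetical scales of eight notes judged by thirty observers.
concentrated = [1, 2, 10, 8, 3, 2, 1, 3]   # 21 of 30 judgments on notes 3 to 5
scattered = [4, 3, 4, 4, 4, 4, 3, 4]       # no three adjacent notes exceed 12 of 30
print(meets_agreement_criterion(concentrated), meets_agreement_criterion(scattered))
```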

The judgmental procedure indi- 
cated, then, that identification of loud- 
ness swells in the scales sung by the 
subjects was a difficult task—one that 
observers with considerable training 
could not perform reliably in most 
instances. Even on those scales with 
respect to which a fair degree of ob- 
server agreement was obtained, more 
than half of the points selected as 
representing such swells were at pitch 
locations which were not appropriate 
(according to other criteria) for the 
location of subjects’ optimum pitch 
levels. 


Summary 


The purpose was to examine ex- 
perimentally the procedure for esti- 
mating natural, or optimum, pitch 
level which requires subjects to sing 
or hum the musical scale while an 
observer attempts to locate the pitch 
at which an involuntary swell of loud- 
ness occurs. Thirty subjects having 










normal voice quality were recorded 
on good quality phonographic equip- 
ment while singing up the musical 
scale on the vowels [a], [e] and [u] 
and while humming both upward and 
downward over their entire vocal 
ranges including falsetto. For each 
note of the 150 scales thus recorded a 
fundamental frequency measurement 
and a measurement of relative sound 
level were made. 

The results showed little consist- 
ency in the sound level versus fre- 
quency patterns for the scales. Sig- 
nificant maxima (five db or greater) 
occurred on less than half of the 150 
scales and a minority of those which 
did occur were found within limits 
considered reasonable for the location 
of optimum pitch. Among the five 
scales produced by each subject there 
was little consistency in the pattern of 
sound-level variation. Observers found 
selection of pitch locations of loudness 
maxima difficult and their results 
showed poor agreement. 

Neither the results of the physical 
measurements nor the judgments of 
loudness variations by observers sup- 










port the clinical technique in question 
for estimating optimum pitch level. 


References 


1. Anderson, V. A., Training the Speak- 
ing Voice. New York: Oxford Univer- 
sity Press, 1942. 

2. Cowan, M., Pitch and intensity char- 
acteristics of stage speech. Arch. Speech, 
Suppl., 1, 1936, 1-92. 

3. Fairbanks, G., Voice and Articulation 
Drillbook. New York: Harper, 1940. 

4. Pronovost, W., An experimental study 
of methods for determining natural and 
habitual pitch. Speech Monogr., 9, 1942, 
111-123. 

5. Strother, C. R., Voice Improvement. 
In Foundations of Speech, ed. by J. 
O'Neill. New York: Prentice-Hall, 
1946. 

6. Van Riper, C., Speech Correction 
Principles and Methods (3rd ed.). New 
York: Prentice-Hall, 1954. 

7. Wentworth, E. T., Survey of methods 
for improvement of pitch usage in 
speech as presented in twenty-five cur- 
rent speech texts. M.A. thesis, State 
University of Iowa, 1940. 

8. West, R. W., Ansberry, M., and Carr, 
A., The Rehabilitation of Speech (3rd 
ed.). New York: Harper, 1957. 





Listener Evaluations 
Of Speech Interruptions 


Dean E. Williams 


Louise R. Kent 


Certain kinds of interruptions which 
occur in normal speech are often not 
discernible from those popularly con- 
sidered to constitute stuttering. The 
assumption, however, that individuals 
are able to discriminate consistently 
between ‘stuttered’ and ‘normal’ 
speech interruptions is still commonly 
accepted. This is expressed explicitly 
by West (4) in the statement, ‘.. . 
everyone but the expert knows what 
stuttering is.’ This assumption has been 
sustained in spite of an accumulation 
of evidence to the contrary (1, 2, 3). 


The present study was designed to 
test the following hypothesis: Indi- 
viduals do not consistently respond 
to interruptions in speech as either 
stuttered or normal. The specific prob- 
lem was to determine whether in- 
dividuals are more likely to classify 
interruptions as ‘stuttered’ when in- 
structed to listen for stuttered inter- 
ruptions and, conversely, more likely 
to classify the same interruptions as 
‘normal’ when instructed to listen for 
normal interruptions. 





Dean E. Williams, (Ph.D., State Uni- 
versity of Iowa, 1952) is Assistant Professor 
of Speech and Theatre, Speech and Hearing 
Clinic, Indiana University. Louise R. Kent, 
(M.A., Indiana University, 1957) is cur- 
rently in private practice in Baton Rouge, 
Louisiana. 


Stimulus Material 


The stimulus material was a 900- 
word tape-recorded speech. The 
speech contained 52 speech interrup- 
tions distributed among six types of 
nonfluencies as follows: syllable repeti- 
tion — one-syllable (3), two-syllable 
(3) and three-syllable (4); prolonga- 
tions (9); interjections (13); word 
repetitions (4); phrase repetitions 
(5); revisions (11). The syllable repe- 
titions consisted of one, two and three 
repetitions of the initial syllable of 
words. Only syllables beginning with 
consonant sounds were included and 
no particular sounds appeared more 
frequently than any other. Of the nine 
prolongations, lasting one second each, 
six were in the initial position of words 
and three were in the medial posi- 
tion of words; no sound was dupli- 
cated. The interjections were ‘er’ and 
‘ah’; four occurred at the end of 
phrases, one at the end of a sentence 
and eight between words. Word and 
phrase repetitions should be self- 
explanatory. The revisions were 
divided three, six and two, in the 
initial, medial and final positions of 
sentences, respectively. 
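
The category counts given above can be tallied to confirm the stated total of 52 and to express each category as a share of the stimulus material. This is a small arithmetic sketch; the dictionary keys are labels chosen for illustration.

```python
# Interruption counts by category, as listed in the description of the stimulus material.
stimulus_counts = {
    "one-syllable repetition": 3,
    "two-syllable repetition": 3,
    "three-syllable repetition": 4,
    "prolongations": 9,
    "interjections": 13,
    "word repetitions": 4,
    "phrase repetitions": 5,
    "revisions": 11,
}

total = sum(stimulus_counts.values())
percentages = {name: 100 * count / total for name, count in stimulus_counts.items()}

print(total)  # 52
for name, pct in percentages.items():
    print(f"{name:28s}{pct:5.1f}")
```

Rounded to one decimal, revisions come out at 21.2 where the article's Table 1 prints 21.1, presumably so that the printed column sums to exactly 100.0.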

The recorded voice was that of a 
normal speaker with a clinical knowl- 







edge of speech correction. A casual, 
conversational style of delivery was 
simulated; the speaker avoided notice- 
able changes in reading rate and in- 
flection in order to make the contrived 
interruptions appear as natural as pos- 
sible. Criterion judgments were ob- 
tained from three trained speech cor- 
rectionists. 


In order to test the assumption that 
the stimulus material was not clearly 
discriminable as either stuttered or 
non-stuttered speech, 32 subjects (15 
males and 17 females) in introductory 
public speaking classes who did not 
participate in the experiment proper 
were asked to listen to the speech. 
These subjects were instructed to 
judge it according to a rating sheet 
ordinarily used by the instructor in 
rating the speeches of students in 
public speaking classes. The students 
were also instructed to record on the 
rating sheets their personal impression 
of the speech and the speaker along 
with any criticisms or comments. 
These students were told that the 
speaker was a person in an introduc- 
tory public speaking class; that is, 
they were not told that the speaker 
stuttered. 

The ratings made by the students 
were on such qualities as organization, 
interest, grammar, articulation and 
vocabulary, and are not of particular 
interest to this study. The comments, 
however, are of interest. Fourteen of 
the students used the word ‘stuttering’ 
or ‘stammering’ in their comments; 
five mentioned ‘speech defect,’ but not 
‘stuttering’; and 13 made no reference 
to speech defects. 

Some of the comments made by 
those students who mentioned no 
speech defect are interesting in that 
they represent a variety of reactions 
to the speech. Abstracts from these 
comments are as follows: 





WILLIAMS AND KENT: EVALUATIONS OF SPEECH INTERRUPTIONS 







Body (of speech) was confusing and 
poorly organized. 

Dull. 

He was sort of a George Gobel type 
speaker. 

Was it the speaker’s purpose to slur some 
of his words? 

His time and articulation was rather poor. 


His words are in too much of a mono- 
tone. 


His speech wasn’t well learned. 

He is not sure of his speech. 

Sounds like ‘Just Plain Bill’ after taxes. 
His voice was raspy and distracting. 

It sounded like a poor job of memorizing. 


It was concluded that the speech non- 
fluencies inserted into the recorded 
speech were (1) sufficiently obvious 
to attract the attention of many ob- 
servers and (2) not deviant to the ex- 
tent that the speaker was evaluated as 
a stutterer by more than one-half of 
the observers. 


Procedure 


A total of 70 subjects (38 males and 
32 females) were used in the experi- 
ment proper. Group I was composed 
of 36 subjects (22 males and 14 
females), while Group II was com- 
prised of 34 subjects (16 males and 18 
females). These subjects were re- 
cruited from freshman and sophomore 
classes at Indiana University; no sub- 
jects had received training in speech 
correction. 


Subjects in the two experimental 
groups were told that they would hear 
a recorded speech given by a person 
who stuttered. Each subject was pro- 
vided with a mimeographed copy of 
the speech; no indication of the speech 
interruptions appeared on this copy. 
Both groups were instructed to follow 
the speech on their copies as they 
listened to the recording and to mark 
through the words or spaces between 




TABLE 1. Actual versus obtained percentage distribution of types of interruptions marked under 
instructions to mark all interruptions. 

                                      Percentage Distribution 
                               Actual               Obtained 
Categories of Interruptions    Stimulus Material    Group I    Group II 

TOTAL                              100.0*            93.7*      94.2* 

Syllable repetition 
  One-syllable                       5.8              5.4        5.3 
  Two-syllable                       5.8              5.7        5.7 
  Three-syllable                     7.7              7.7        7.6 
Prolongations                       17.3             17.1       16.9 
Interjections                       25.0             20.9       21.5 
Word repetitions                     7.7              7.3        7.2 
Phrase repetitions                   9.6              9.3        9.3 
Revisions                           21.1             20.3       20.7 

*Based on actual total of 52 interruptions. 


words where they heard interruptions. 
This procedure was repeated three 
times for each group under three sets 
of instructions. 

When the record was played for 
the first time to Group I, the instruc- 
tions were to mark the stuttered inter- 
ruptions. The papers were then col- 
lected and new sheets distributed. 
When the recording was played the 
second time, the instructions were to 
mark all interruptions. The papers 
were again collected and new sheets 
distributed. The instructions given be- 
fore the recording was played for the 
third time were to mark the normal 
interruptions. 

For Group II the order of instruc- 
tions was reversed. The subjects were 
instructed first to mark normal inter- 
ruptions, second to mark all interrup- 
tions and third to mark stuttered 
interruptions. 

Before the record was played for 
the first time for either group, the ex- 
perimenter pointed out that in the 
normal course of speaking, interrup- 
tions may be, and frequently are, 
uttered which do not necessarily de- 
note stuttering. It was emphasized 
that there were no right nor wrong 
answers, that the papers would not be 
graded in any way, that subjects need 
not put their names on the papers and 
that the purpose of the study would 
be defeated if the subjects copied from 
each other. The subjects were in- 
structed not to mark pauses. 

The middle or second condition in 
which the subjects of both groups 
were instructed to mark all interrup- 
tions was essentially a control condi- 
tion used to determine whether the 
groups differed as to the number of in- 
terruptions responded to as such. The 
subjects in Group I responded on the 
average to 93.7% of the total of 52 
interruptions and subjects in Group II 
responded on the average to 94.2% 
of the total. 

In Table 1 the actual percentage 
distribution of the different types of 
interruptions inserted in the stimulus 
material are compared with the per- 
centage distributions obtained from 
the two groups under instructions to 














[Figure 1: bar chart of per cent of possible interruptions marked, by category of interruption; legend: stuttered first, normal last.] 

Figure 1. Percent of possible interruptions 
marked on eight categories by Group I, 
which marked stuttered interruptions first, 
normal last. 


mark all interruptions. It is clear that 
the subjects in both groups responded 
at frequencies closely approximating 
the actual relative frequencies of the 
types of interruptions inserted into the 
speech. The results also reflect favor- 
ably upon the fidelity of the recording. 


Results 


The responses from the two groups 
of subjects yielded data corresponding 
to the two sets of instructions: (a) 
mark the stuttered interruptions and 
(b) mark the normal speech interrup- 
tions. 

The mean number of times that a 
particular type of interruption was 
marked was computed separately un- 








[Figure 2: bar chart of per cent of possible interruptions marked, by category of interruption; legend: stuttered last, normal first.] 

Figure 2. Percent of possible interruptions 
marked on eight categories by Group II, 
which marked normal interruptions first, 
stuttered last. 


der the two sets of instructions for 
both groups. For each type of inter- 
ruption, four means were obtained: 
1. that for Group I under instructions 
to mark stuttering, 2. that for Group I 
under instructions to mark normal in- 
terruptions, 3. that for Group II under 
instructions to mark normal interrup- 
tions and 4. that for Group II under 
instructions to mark stuttering. Since 
the number of interruptions in the 
stimulus material varied from one type 
of interruption to another, the ob- 
tained means were converted to per- 
cent of interruptions possible of a 
given type. For example, when in- 
structed to mark stuttered interrup- 
tions, Group I marked a mean num- 
ber of 8.7 instances of prolongations. 
Since there were actually 9 instances of 
prolongations in the stimulus material, 
the subjects responded to 96.6% of 
the total possible. These data are pre- 
sented in Figures 1 and 2. 
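
The conversion just described is a simple ratio. The sketch below applies it to the prolongation example; the function name is an assumption, and because the reported mean of 8.7 is presumably rounded, the result (about 96.7) differs slightly from the 96.6% given in the text.

```python
def percent_of_possible(mean_marked, n_possible):
    """Express a mean number of marked interruptions as a percentage of those possible."""
    return 100 * mean_marked / n_possible


# Prolongations, Group I, instructed to mark stuttered interruptions:
# a reported mean of 8.7 marked out of 9 possible.
print(round(percent_of_possible(8.7, 9), 1))
```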

Inspection of Figures 1 and 2 re- 
veals that the order of instructions 
introduced a bias. Group I, instructed 
first to mark stuttered interruptions, 
classified more interruptions as stut- 
tered than they subsequently marked 
as normal; and conversely, Group II, 
instructed to mark normal interrup- 
tions first, classified more interruptions 
as normal than they subsequently 
marked as stuttered. The bias is most 
clearly demonstrated in the categories 
of word repetition, phrase repetition 
and interjections. Group I, marking 
stuttered interruptions first, classified 
more interruptions in these categories 
as stuttered than as normal; while 
Group II, marking normal interrup- 
tions first, classified more of them as 
normal than as stuttered. The bias, 
although discernible, was much less 
marked in the categories of revisions, 
syllable repetitions and prolongations. 
In these categories the two groups 







were in relative agreement in spite of 
the bias introduced by the order of 
instructions. Revisions were primarily 
considered normal interruptions while 
syllable repetitions and prolongations 
were primarily considered as stutter- 
ing. 

The procedure used in this study 
allowed the subjects to reverse their 
judgments. Under instruction to mark 
stuttered interruptions, a subject might 
mark a particular interruption as 
‘stuttered,’ and later under instruc- 
tion to mark normal interruptions, 
might judge the same interruption to 
be ‘normal.’ 


The number of instances of incon- 
sistent responses on each category of 
interruption was tabulated for each 
group. Then, for each group, these 
numbers were divided by the total 
number of responses made in the cor- 
responding category under (a) in- 
structions to mark stuttered interrup- 
tions and (b) instructions to mark 
normal interruptions. The quotients 
obtained represent a ratio between the 
total number of responses made in a 
particular category (under one or the 
other set of instructions) and the 
number of inconsistent responses made 
in this same category. This ratio is 
termed an Index of Confusion because 
it is employed as a measure of the 





[Figure 3: bar chart of Index of Confusion by category of interruption; legend: Group I, marked stuttered interruptions first; Group II, marked stuttered interruptions last.] 

Figure 3. Indexes of confusion on eight 
categories of interruptions when subjects 
were instructed to mark stuttered inter- 
ruptions. 


degree of confusion of the subjects 
as to whether a particular type of in- 
terruption should be judged stuttered 
or normal. 
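
Read as a percentage, the Index of Confusion is the number of inconsistent responses divided by the total number of responses in the category, times 100. The sketch below assumes that reading, which agrees with the figures the article reports for revisions in Group I (193 responses, 138 of them later reversed); the function name is illustrative.

```python
def index_of_confusion(inconsistent, total_responses):
    """Inconsistent responses as a percentage of all responses in a category."""
    if total_responses <= 0:
        raise ValueError("total_responses must be positive")
    return 100 * inconsistent / total_responses


# Revisions, Group I, under instructions to mark stuttered interruptions.
print(round(index_of_confusion(138, 193), 1))  # 71.5
```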


An example of the computational 
procedure used is as follows. The sub- 
jects in Group I, when instructed to 
mark stuttered interruptions, classified 
193 instances of revisions as stuttering. 
When asked to mark normal interrup- 
tions, they reversed judgment on 138 
of these instances. There were, then, 
138 inconsistent responses in the cate- 
gory of revisions. Here the Index of 
Confusion is 71.5, which is interpreted 
to indicate a high degree of confusion 
on the part of the subjects in Group I 
as to whether revisions should be class- 
ified as stuttered interruptions. The 
Indexes of Confusion for the two 





[Figure 4: bar chart of Index of Confusion by category of interruption; legend: Group I, marked normal last; Group II, marked normal first.] 

FIGURE 4. Indexes of confusion on eight 
categories of interruptions when subjects 
were instructed to mark normal interrup- 
tions. 


groups are presented in Figures 3 
and 4. The results of the analysis of 
inconsistent responses support and 
complement the data depicted in 
Figures 1 and 2. Figure 3 shows that 
when subjects were instructed to mark 
stuttered interruptions, they evidenced 
less confusion on the categories of 
syllable repetitions and prolongations 
than on the categories of revisions, 








































interjections, word repetitions and 
phrase repetitions; hence, the subjects 
were more positive that syllable repe- 
titions and prolongations were stut- 
tered interruptions and less positive on 
revisions, interjections, word repeti- 
tions and phrase repetitions. 

Figure 4 shows that when subjects 
were instructed to mark normal inter- 
ruptions, they evidenced less con- 
fusion on revisions, interjections, 
phrase repetitions and word repeti- 
tions than on syllable repetitions and 
prolongations; hence, the subjects 
were more positive that revisions, in- 
terjections, phrase repetitions and 
word repetitions were representative 
of normal speech and less positive on 
syllable repetitions and prolongations. 
Thus, confusion or inconsistency, 
although varying in degree, existed on 
all categories of interruptions regard- 
less of the order of instructions. 

In considering these inconsistency 
data, it is important to be aware again 
of the bias introduced by the order 
of instructions. Inconsistency was 
greater for Group I when marking 
normal interruptions and greater for 
Group II when marking stuttered in- 
terruptions. The bias apparently oper- 
ated in the following fashion: Group I 
initially classified a large proportion 
of the interruptions as stuttering; 
therefore, a large number of the inter- 
ruptions subsequently judged to be 
normal had previously been marked as 
stuttering, thus increasing Group I’s 
Index of Confusion under instructions 
to mark normal interruptions. Group 
II, on the other hand, initially classi- 
fied a large proportion of the inter- 
ruptions as normal; therefore, a large 
number of the interruptions sub- 
sequently marked as stuttering had 
previously been marked as normal, 
thus increasing Group II’s Index of 
Confusion under instructions to mark 
stuttered interruptions. 


The data were also analyzed to 
determine whether the two groups, 
regardless of the order of instructions, 
responded similarly to the eight cate- 
gories of interruptions from most to 
fewest inconsistent responses. A rank- 
difference correlation coefficient was 
computed between the Indexes of 
Confusion on the eight categories of 
interruptions obtained for (a) Group 
I when instructed to mark stuttered 
interruptions and (b) Group II when 
instructed to mark stuttered interrup- 
tions. A correlation coefficient of .90 
was obtained between the two sets of 
indexes. When instructed to mark 
stuttered interruptions, subjects in 
both groups tended to have fewest in- 
consistent responses on syllable repeti- 
tions and prolongations and most in- 
consistent responses on revisions. 

A rank-difference correlation coeffi- 
cient was also computed between the 
Indexes of Confusion on the eight 
types of interruptions obtained when 
(a) Group I was instructed to mark 
normal interruptions and (b) Group 
II was instructed to mark normal in- 
terruptions. The correlation coeffi- 
cient obtained here was .85. When 
instructed to mark normal interrup- 
tions, subjects in both groups tended 
to have fewest inconsistent responses 
on revisions and most inconsistent re- 
sponses on syllable repetitions and 
prolongations. 

It may be concluded that the order 
of the eight categories when ranked 
from most to fewest inconsistent re- 
sponses was similar for the two groups 
when given the same instructions re- 
gardless of the order of instructions. 
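
The rank-difference coefficient used here is Spearman's rho, computed from the differences d between paired ranks as 1 - 6Σd²/(n(n² - 1)). The sketch below implements that formula with mean ranks for ties. The two lists of index values are hypothetical, since the article gives the individual Indexes of Confusion only graphically.

```python
def ranks(values):
    """Rank values from 1 (largest) to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            result[i] = mean_rank
        pos = end + 1
    return result


def rank_difference_correlation(x, y):
    """Spearman's rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), d = difference in ranks."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical Indexes of Confusion for eight categories (not the study's values).
group_1 = [12.0, 15.0, 10.0, 18.0, 40.0, 45.0, 55.0, 71.5]
group_2 = [14.0, 11.0, 13.0, 20.0, 50.0, 42.0, 60.0, 75.0]
print(round(rank_difference_correlation(group_1, group_2), 2))
```

With only eight categories, a single swap of adjacent ranks changes the coefficient by about .02, so coefficients such as .90 and .85 are best read as indicating broadly similar orderings.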


Discussion 


The subjects in this study demon- 
strated attitudes toward the different 







types of nonfluencies that have im- 
plications both theoretically and clini- 
cally. Some types were more often 
considered to be stuttering; other 
types were thought to be essentially 
normal; still other types shifted back 
and forth from one classification to 
the other, depending upon the set of 
the listener. 


It is difficult to reconcile these find- 
ings by an examination of the charac- 
teristics of the nonfluency itself. Rea- 
sonably, there is no apparent basis for 
assuming that any one type of non- 
fluency is any more undesirable than 
any other one. The fact that the sub- 
jects judged certain types to be diag- 
nostically different, that is, ‘stuttered’ 
or ‘normal,’ would seem to reflect an 
attitude of the society in which they 
live, rather than any basically deviant 
characteristic of the nonfluency. 


Syllable repetitions and prolonga- 
tions were more consistently identified 
as ‘stuttering.’ This fact is in agree- 
ment with other experimental findings 
(1, 2). Also, it has been observed clin- 
ically many times that the existence of 
syllable repetition in a child’s speech 
is considered as evidence that he is a 
‘stutterer.’ This raises an important 
consideration. Does a child repeat 
syllables, for example, because he is a 
‘stutterer,’ or is he considered to be a 
‘stutterer’ because he repeats syllables? 


On the basis of this study, it is sug- 
gested that the cause and effect rela- 
tionship may work to a degree in 
both directions. Apparently, he could 
become known as a ‘stutterer’ because 
he repeats syllables. Once, however, he 
is identified as a ‘stutterer,’ then his 
word and phrase repetitions, inter- 
jections and even revisions, to a cer- 
tain extent, also come to be considered 
as ‘stuttering,’ seemingly for no other 


reason than because he is now thought 
of as a ‘stutterer.’¹


Summary 


When subjects were instructed to 
listen to a recorded speech and mark 
stuttered interruptions, they marked 
many of the same interruptions that 
they marked when instructed to 
listen to the same speech and to mark 
normal interruptions. They tended to 
‘hear’ what they were instructed to 
listen for at the time. 


The group of subjects instructed to 
mark stuttered interruptions first
marked more interruptions as stuttered 
than they subsequently marked as 
normal. Conversely, the group in- 
structed to mark normal interruptions 
first marked more as normal than they 
subsequently marked as stuttered. 


Of the types of interruptions used 
in this study, subjects in both groups 
tended to respond most consistently, 
relatively speaking, to syllable repeti-
tions, prolongations and revisions; syl-
lable repetitions and prolongations


¹One of the objectives in counseling the
parents of a child who is considered to be 
‘stuttering’ is to help them re-evaluate some 
of the child’s speech interruptions. It has 
proved profitable to the present authors, at 
least, first to acquaint the parents with the 
concept of normal speech and then to ask 
them to pay particular attention to and 
to keep track of the normal interruptions in 
their child’s speaking behavior. Often, the 
parents, when faced with the task of listen- 
ing for and noting normal interruptions, not 
only become more interested in what con- 
stitutes a normal interruption, but begin 
classifying more of the interruptions as 
‘normal.’ As a consequence, the number of 
interruptions reacted to as ‘stuttering’ is re- 
duced. 










WILLIAMS AND KENT: EVALUATIONS OF SPEECH INTERRUPTIONS 131 


were most consistently responded to as 
stuttered interruptions while revisions 
were most consistently responded to 
as normal. 


Acknowledgments 


The assistance of James Bost, gradu- 
ate assistant, Indiana University, and 
of Van C. Kussrow, Instructor of 
Speech, Valparaiso University, in the 
collection of the data is gratefully 
acknowledged. 


References 


1. Boehmler, R. M., A quantitative study
of the extensional definition of stuttering
with special references to the audible
designata. Ph.D. Dissertation, State Uni-
versity of Iowa, 1953.

2. Giolas, T. G. and Williams, D. E.,
Children’s reactions to nonfluencies in
adult speech. JSHR, 1, 1958, 86-93.

3. Tuthill, C. E., A quantitative study of
extensional meaning with special refer-
ence to stuttering. Speech Monographs,
13 (1), 1946, 81-98.

4. West, Robert, Ansberry, M., and Carr,
A., The Rehabilitation of Speech. New
York: Harper and Brothers, 1957 (3rd
edition).





Listener Responses To Non-Fluencies 


Richard M. Boehmler 


This experiment was concerned with 
studying speech responses which are 
extensionally designated as stutter- 
ing behavior. The main purposes were 
to investigate (1) the relationship be- 
tween the rated severity of moments 
of non-fluency and the behavior of 
judges in labeling these speech phe- 
nomena as stuttering and (2) the re- 
lationship between the training of 
the judges and their behavior in the 
labeling process. Another purpose was 
to evaluate relationships between 
types of non-fluency and the behavior 
of judges in the labeling process. 


Procedure 


Selection of Speech Samples. 
Samples of non-fluencies were se- 
lected from short tape recordings of 
the speech of 90 college students, 60 
non-stutterers and 30 stutterers. Each 
speech sample was approximately five 
seconds in length and contained only 
one moment of non-fluency. The se- 
lection was made to obtain a wide 
range of severity and different types 
of non-fluencies. The total selection 
consisted of 804 samples, 402 from 





Richard M. Boehmler (Ph.D., State Uni- 
versity of Iowa, 1953) is Assistant Pro- 
fessor of Speech Pathology and Audiology, 
Humboldt State College. This article is 
based on a doctoral dissertation completed 
under the direction of Professors Wendell 
Johnson and Dorothy Sherman. 


the speech of stutterers and 402 from 
the speech of non-stutterers. 

The 804 samples were presented in 
random order to a group of 32 ele- 
mentary psychology students. They 
rated the severity of each non- 
fluency on a seven-point equal-ap- 
pearing intervals scale extending 
from one, for least severe, to seven, 
for most severe. A median scale value 
was obtained for each sample by the 
method described by Thurstone and 
Chave (6). Each of the two groups of 
samples was then divided into three 
sub-groups according to the obtained 
severity ratings. Samples with large 
Q-values or with median scale values 
near the points of division between 
two sub-groups were eliminated. The 
final selection consisted of six sub- 
groups of 100 samples each. The 
means of the obtained median scale 
values of the six sub-groups were as 
follows: 1.90 for mild samples from 
the speech of non-stutterers, 2.95 for 
average samples from the speech of 
non-stutterers; 4.13 for severe samples 
from the speech of non-stutterers; 
2.45 for mild samples from the speech 
of stutterers; 4.20 for average samples 
from the speech of stutterers; 6.08 for 
severe samples from the speech of 
stutterers. 
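
The scaling step above can be illustrated with a short Python sketch. It assumes that the scale value of a sample is the median of the listeners' ratings and that the Q-value is the interquartile range of those ratings; the ratings below are hypothetical, and the plain median used here is a simplification of the interpolation procedure of Thurstone and Chave.

```python
import statistics


def scale_value_and_q(ratings):
    """Return (median rating, interquartile range) for one speech sample."""
    median = statistics.median(ratings)
    # Quartiles computed with the "inclusive" method of the standard library
    q1, _, q3 = statistics.quantiles(ratings, n=4, method="inclusive")
    return median, q3 - q1


# Hypothetical ratings of one sample by seven listeners on the 1-7 scale
ratings = [1, 2, 3, 4, 5, 6, 7]
median, q_value = scale_value_and_q(ratings)
print(f"median scale value = {median}, Q = {q_value}")
```

A sample with a large Q-value is one on which the listeners disagreed about severity, which is why such samples were eliminated.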


Judging of the Speech Samples. 
Three groups of judges which dif- 
fered from each other in degree and 
in kind of training in stuttering 













BOEHMLER: LISTENER RESPONSES TO NON-FLUENCIES 133


TABLE 1. Summary of analysis of variance for evaluation of the frequency of stuttering label data.

Source of Variation      df          ss          ms       F*      p†
Between Subjects         29    25227.36
  Judges (J)              2     9801.35     4900.68     8.52    .005
  error (b)              27    15536.01      575.41
Within Subjects         150   157092.17
  Severity (S)            2    57033.75    28516.87   546.51    .001
  Origin (O)              1    76590.94    76590.94   975.93    .001
  SO                      2    12530.52     6265.26   112.76    .001
  SJ                      4      798.18      199.54    15.29    .001
  OJ                      2      384.20      192.10     2.45    NS††
  SOJ                     4    11618.84     2904.72    52.28    .001
  error (w)             135     7937.12       58.79
    error1 (w)           54     2817.80       52.18
    error2 (w)           27     2119.03       78.48
    error3 (w)           54     3000.29       55.56
Total                   179   182429.53

*F-ratios: msJ/mserror(b); msS/mserror1(w); msO/mserror2(w); msSO/mserror3(w);
msSJ/mserror1(w); msOJ/mserror2(w); msSOJ/mserror3(w)
†p = point in the F-distribution
††Not significant



theory and therapy were selected. 
One group (Trained A) consisted of 
10 graduate students in speech path- 
ology who had had at least three 
semester hours of training in the area 
of stuttering and clinical experience
with at least one stutterer at the State 
University of Iowa. Another group 
(Trained B) consisted of 10 staff 
members at the Institute of Logope- 
dics who had had at least as much 
training and experience as the mem- 
bers of the Trained A group. The 


third group (Untrained) consisted of 
10 elementary psychology students 
who had had no training in speech 
pathology. 

The 600 samples were presented in 
random order by tape playback to 
each of the three groups of judges at 
three separate listening sessions. The 
judges were instructed to record S 
for each sample which seemed to con- 
tain an example of ‘stuttering’ and N 
for each sample which seemed to con- 


TABLE 2. Mean frequencies of the stuttering label for six sub-groups of 100 non-fluencies each,
judged by three groups of observers (1. Trained A, 2. Trained B and 3. Untrained).

                                       Judges               General Means
Origin           Sub-groups       1       2       3    (Sub-groups)  (Origins)
Non-stutterers   Mild          12.8    27.0    10.9        16.9
                 Average       20.7    35.5    14.2        23.5         27.5
                 Severe        43.8    54.5    28.2        42.2
Stutterers       Mild          31.7    49.2    22.9        34.6
                 Average       76.5    83.9    65.4        75.3         68.6
                 Severe        98.3    95.3    95.7        96.4
(General Means)                47.3    57.6    39.6













[Figure 1 plot. Ordinate: Frequency of Stuttering Label. Abscissa: Scale of Rated Severity of Non-fluencies (0 to 7). Solid lines: Stutterers’ Non-fluencies; dashed lines: Non-stutterers’ Non-fluencies. Plotted symbols distinguish Trained A, Trained B and Untrained Judges.]





Figure 1. The trends of mean frequencies 
of the use of the stuttering label as a func- 
tion of rated severity. Each plotted point 
represents the mean for 100 speech samples. 


tain an example of ‘non-stuttering’ 
non-fluency. 


Results 


Frequency of Labeling. The meas- 
ure for the frequency of labeling was 
the number of non-fluencies within 
any one sub-group of 100 speech 
samples labeled by one judge as ‘stut- 
tering.’ The obtained data were eval- 
uated by an analysis of variance de- 
scribed by Lindquist (5) and identi- 
fied as Type VI. The analysis in- 
cluded three factors: (1) training of 
judges (Trained A, Trained B and 
Untrained); (2) origin of non-fluen- 
cies (from the speech of stutterers 
and from the speech of non-stutter- 
ers); and (3) severity of non-fluencies 
(mild, average and severe). A sum- 
mary of the analysis is presented in 


Table 1. Table 2 gives the mean fre- 
quency of labeling for each of the 
18 combinations among classifications. 
The trend of frequency of labeling as 
a function of rated severity is rep-
resented graphically¹ for each of the
three groups of judges for each origin 
in Figure 1. 
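
As an illustration of how the general means for the three groups of judges follow from the cell means, the sketch below averages the six sub-group means reported in Table 2 for each group. The variable names are illustrative only; the equal weighting is justified by the fact that every sub-group contained 100 samples.

```python
# Cell means from Table 2: (Trained A, Trained B, Untrained)
cell_means = {
    ("non-stutterers", "mild"): (12.8, 27.0, 10.9),
    ("non-stutterers", "average"): (20.7, 35.5, 14.2),
    ("non-stutterers", "severe"): (43.8, 54.5, 28.2),
    ("stutterers", "mild"): (31.7, 49.2, 22.9),
    ("stutterers", "average"): (76.5, 83.9, 65.4),
    ("stutterers", "severe"): (98.3, 95.3, 95.7),
}

# Average down each judge-group column (equal weights: 100 samples per cell)
judge_means = [sum(col) / len(col) for col in zip(*cell_means.values())]

for name, value in zip(("Trained A", "Trained B", "Untrained"), judge_means):
    print(f"{name}: {value:.2f}")
```

The three column averages agree, within rounding, with the general means of 47.3, 57.6 and 39.6 given in the table.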

Labeling in Relation to Groups of 
Judges. The judges differed from 
group to group in the mean fre- 
quency of application of the stutter- 
ing label. The Trained B group did the
most labeling, with a mean of 57.6 
out of a possible 100. The Trained 
A group was next with a mean of 
47.3. The Untrained group did the 
least amount of labeling with a mean 
of 39.6. The differences among these 
means were highly significant (See 
Table 1). These results provide evi- 
dence that individuals who differ in 
the degree and kind of training in 
speech pathology are likely to differ 
also in the frequency with which they 
apply the stuttering label. This con-
firms the finding by Tuthill (7) who
reported that clinicians did more 
labeling than non-clinicians. Both the 
interaction of severity with training 
and the interaction of origin with 
training effects are significant. In 
other words, the differences among 
the groups of judges vary from one
sub-group of non-fluencies to another 
and also from the speech of stutterers 
to the speech of non-stutterers. How- 
ever, an examination of the data 
shows that the same rank order is re- 
tained for all groups of judges for the 
two origins and for all six sub-groups 


¹The graphical representation does not
correspond exactly to the statistical analysis 
in that the abscissa represents scale values of 
rated severity. For purposes of statistical 
analysis the differences evaluated are be- 
tween corresponding mild, average and 
severe categories. 










of non-fluencies with the single ex- 
ception of the severe samples from 
the speech of stutterers. In this in- 
stance the differences are very small. 
The change in rank order can thus 
be accounted for by the fact that the 
maximum possible frequency of label- 
ing was being approached with high 
agreement among groups. 

Labeling in Relation to Severity of 
Samples. The differences among the 
mean frequencies of labeling for the 
three severity sub-groups of non- 
fluency samples were highly signif- 
icant for both origins (stuttering and 
non-stuttering) combined. The rank 
order indicated that the more severe 
the sample of non-fluency, the more 
likely it is to be labeled as stuttering. 
The rank order was from mild to 
average to severe for all groups of 
judges and for both origins of the 
samples. Apparently, severity of the 
non-fluency being judged is a major 
determinant for application of the 
stuttering label. 

Labeling in Relation to Origin of 
Non-Fluencies. The stuttering label 
was applied more frequently to the 
samples taken from the speech of 
stutterers than to those taken from 
the speech of non-stutterers, as shown 







in Figure 1. The differences between 
the two origins varied somewhat with 
the level of severity, but in all cases 
the samples in the sub-groups from 
the stutterers were labeled as stutter- 
ing more frequently than were the 
samples in the corresponding groups 
from the non-stutterers. This differ- 
ence can be accounted for in part by 
the generally greater severity of the 
samples from the stutterers. However, 
Figure 1 also shows that the stutter- 
ing label was more frequently ap- 
plied to samples from the speech of 
stutterers than to samples from the 
speech of non-stutterers even when 
the samples from the two origins 
were of approximately equal mean 
severity. For example, although com- 
parable in rated severity, the average 
samples of non-fluency from the 
speech of stutterers were more fre-
quently labeled as stuttering than
were the severe samples of non- 
fluency from the speech of non-stut- 
terers. Apparently factors other than 
the rated severity of the moment of 
non-fluency influence the labeling 
process. 

Labeling in Relation to the Type 
of Non-Fluency. A fourth factor, type 
of non-fluency, was investigated by 


TABLE 3. Frequency of each type of non-fluency within each sub-group of experimental speech
samples.

                                      From the Speech of        From the Speech of
                                        Non-stutterers              Stutterers
Types of Non-fluencies               Mild  Average  Severe     Mild  Average  Severe
Interjections                           6        7      11        5        8       9
Repetitions of Sounds or Syllables     32       17     [?]       19       24      25
Repetitions of Words                   27       19      11       12       11       5
Repetitions of Phrases                  9       10     [?]       12        3       0
Revisions                               7       17       5        2        0       0
Prolongations                           5        2       0       13        4       6
Mixtures                               14       26      57       21       39      55
Others                                  0        3       5       16       11       0
Total                                 100      100     100      100      100     100










means of a chi-square test of inde- 
pendence. The categories, or types of 
non-fluencies, were as follows: (1) 
interjected sounds, instances of any 
extraneous sound such as ‘uh’ or ‘er’ 
which was distinct from sounds as- 
sociated with the repetitions of an 
initial sound or syllable; (2) repeti- 
tions of sounds or syllables; (3) rep- 
etitions of single words; (4) repeti- 
tions of phrases, instances of any rep- 
etition of more than one word in 
which no revision was made in the 
course of the repetition; (5) revisions, 
instances in which the wording of a 
phrase was modified by changing at 




least one word with a resultant change 
in the meaning of the phrase; (6) 
prolongations, instances of any ap- 
parent prolongation of a sound at the 
beginning or end of a word, or with- 
in a word; (7) mixtures, instances of 
any complex moment of non-fluency 
which consisted of more than one 
type, such as a repetition of a sound 
within a repetition of a word; (8) 
others, instances of an explosive, a 
pause, a broken word, or an inter- 
rupted phrase (these occurred rela- 
tively infrequently and were grouped 
to avoid too small theoretical fre- 
quencies for a chi-square test). The 


TABLE 4. Frequencies of non-fluency types in 400 samples of speech, 100 from each of four sub-
groups, average samples and severe samples from the speech of non-stutterers and mild samples
and average samples from the speech of stutterers.

Type of                           Frequency          Frequency
Non-Fluency                     Below Median*      Above Median*      Total
Word Repetitions                 24 (26.5)†         29 (26.5)           53
Sound or Syllable Repetitions    21 (33)            45 (33)             66
Phrase Repetitions               19 (15)            11 (15)             30
Mixtures                         68 (71.5)          75 (71.5)          143
Revisions                        18 (12)             6 (12)             24
Interjections                    20 (15.5)          11 (15.5)           31
Prolongations                    13 (9.5)            6 (9.5)            19
All Others                       17 (17)            17 (17)             34
Total                           200                200                 400

Chi-square (df=7) = 21.83, significant beyond one per cent level.

*The non-fluencies were dichotomized as Below Median and Above Median with reference to
the number of judges applying the stuttering label.
†Values within parentheses are the theoretical or expected frequencies.













distribution of samples among cate- 
gories is given in Table 3. 

To study the relationship between 
frequency of labeling and the type 
of non-fluency, it was necessary to 
control the influence of severity by 
using only four sub-groups. When all 
six sub-groups were included, the 
average rated severities of samples 
from non-stutterers and stutterers
were 2.99 and 4.23, respectively. For
the combined average and severe
samples from the speech of non-stut-
terers and for the combined mild and
average samples from the speech of
stutterers, the average rated sever-
ities were approximately the same,
3.54 and 3.31, respectively. The mild 
samples from the speech of non-stut- 
terers and the severe samples from 
the speech of stutterers were thus 
omitted from this part of the analysis. 
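
The severity matching described above can be retraced from the six sub-group means reported in the Procedure section. The sketch below is only an arithmetic check under the assumption that the pooled severities are simple averages of the sub-group means (each sub-group has 100 samples); small discrepancies from the reported figures are presumably due to rounding of the sub-group means.

```python
from statistics import mean

# Means of the median scale values, as reported for the six sub-groups
subgroup_means = {
    ("non-stutterers", "mild"): 1.90,
    ("non-stutterers", "average"): 2.95,
    ("non-stutterers", "severe"): 4.13,
    ("stutterers", "mild"): 2.45,
    ("stutterers", "average"): 4.20,
    ("stutterers", "severe"): 6.08,
}


def pooled_severity(origin, levels):
    """Average the sub-group means of one origin over the given levels."""
    return mean(subgroup_means[(origin, level)] for level in levels)


all_levels = ("mild", "average", "severe")
ns_all = pooled_severity("non-stutterers", all_levels)
st_all = pooled_severity("stutterers", all_levels)
ns_matched = pooled_severity("non-stutterers", ("average", "severe"))
st_matched = pooled_severity("stutterers", ("mild", "average"))

print(f"all six sub-groups: {ns_all:.2f} vs {st_all:.2f}")
print(f"four matched sub-groups: {ns_matched:.2f} vs {st_matched:.2f}")
```

Dropping the mildest non-stutterer sub-group and the most severe stutterer sub-group reduces the gap between the two origins from more than one scale point to roughly a fifth of a point.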

The measure was the number of 
judges who labeled a non-fluency as 
an example of stuttering. Since there 
were 30 judges, the measure could 
vary from zero to 30. The median 
of the measures for the 200 samples 
from the speech of stutterers was 16.5,
and the median of the measures for 
the 200 samples from the speech of 
non-stutterers was 8.77. These median 
values were used as the reference 
points for dichotomizing the samples. 
If the number of judges labeling a
sample as stuttering exceeded 16.5 for
the samples from the speech of stut-
terers or 8.77 for the samples from
the speech of non-stutterers, the sample
was classified as reported in Table 4
as ‘Above Median.’ If the number of
judges labeling a given sample fell 
below the appropriate reference value, 
it was classified as ‘Below Median.’ 
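
The dichotomizing rule can be sketched as follows, taking the medians as first reported (16.5 for samples from stutterers and 8.77 for samples from non-stutterers). The function name and the example counts are hypothetical.

```python
# Median number of judges (out of 30) applying the stuttering label, by origin
MEDIANS = {"stutterers": 16.5, "non-stutterers": 8.77}


def classify_sample(label_count, origin):
    """Classify one sample relative to the median for its own origin."""
    if not 0 <= label_count <= 30:
        raise ValueError("label_count must lie between 0 and 30 judges")
    if label_count > MEDIANS[origin]:
        return "Above Median"
    return "Below Median"


# Hypothetical samples: the same count of 12 judges falls on different
# sides of the reference point depending on the origin of the sample.
print(classify_sample(12, "stutterers"))      # Below Median
print(classify_sample(12, "non-stutterers"))  # Above Median
```

Using a separate reference point for each origin keeps the overall difference between stutterers and non-stutterers from dominating the comparison among types of non-fluency.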

To test the hypothesis that the 
number of judges labeling a given 
sample was independent of the type 


of non-fluency, a chi-square was com- 
puted from the data presented in 
Table 4. A highly significant result 
indicated that the number of judges 
labeling a given sample as stuttering 
was related to the type of non-fluency 
contained within the sample. In the 
category of repetition of sounds and 
syllables, 45 samples (17 of 23 from 
non-stutterers and 28 of 43 from stut- 
terers) were above the medians and 
21 were below. The evidence thus 
indicates that samples containing 
sound or syllable repetitions are likely 
to be labeled as stuttering. This may 
be one reason that the non-fluencies 
from the speech of stutterers were 
more frequently labeled as stuttering 
than were the non-fluencies from the 
speech of non-stutterers. More of the 
samples from stutterers contained
sound or syllable repetition than did 
the samples from the non-stutterers. 
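
The chi-square test of independence can be retraced from the observed frequencies in Table 4. The sketch below is a plain-Python illustration, not the author's computation: it derives the expected frequencies from the marginal totals and sums the usual (observed - expected)² / expected terms. Computed this way from the tabled frequencies, the statistic comes to about 22.87, slightly different from the printed 21.83; either value exceeds the one per cent critical value of about 18.48 for seven degrees of freedom.

```python
# Observed frequencies from Table 4: (Below Median, Above Median)
observed = {
    "Word repetitions": (24, 29),
    "Sound or syllable repetitions": (21, 45),
    "Phrase repetitions": (19, 11),
    "Mixtures": (68, 75),
    "Revisions": (18, 6),
    "Interjections": (20, 11),
    "Prolongations": (13, 6),
    "All others": (17, 17),
}


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            stat += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df


stat, df = chi_square_independence(observed)
print(f"chi-square = {stat:.2f} with {df} degrees of freedom")
```

The largest single contribution comes from the sound or syllable repetitions (45 observed above the median against 33 expected), which is the category the text singles out.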

The mixed samples were approxi- 
mately equally distributed above and 
below the median frequency values 
of application of the stuttering label. 
However, 53 of the 143 mixed samples 
contained sound or syllable repeti- 
tions and 79 per cent of these 53 
samples were above the medians. Also, 
among the mild samples from the 
speech of non-stutterers, which were 
not included in the chi-square test, 
five of the six samples most frequently 
labeled as stuttering contained sound 
or syllable repetitions. Thus, there is 
some further evidence beyond that 
provided by the significant chi-square 
value that samples containing sound 
or syllable repetitions are likely to be 
labeled as stuttering. Furthermore, 
severity could be ruled out as an im- 
portant factor since the average se- 
verity, 3.43, of all samples in the four 
sub-groups was approximately the 
same as the average severity, 3.35, of 







the samples containing sound and syl- 
lable repetitions. 

Revisions also may have contributed 
to the difference between samples 
from the speech of non-stutterers and 
those from the speech of stutterers 
with respect to the frequency of the 
stuttering label. As indicated in Table 
4, there were three times more re- 
vision samples below than above the 
median. As indicated in Table 3, there 
were 22 revision samples from the 
speech of non-stutterers as compared 
with two from the speech of stutter- 
ers. Other categories may also have 
contributed to the difference between 
the two origins, but small frequencies 
preclude further interpretation with 
regard to the importance of various 
types of non-fluencies. 

Table 4 shows that samples con- 
taining interjections were likely not 
to be labeled as stuttering. This ten- 
dency did not contribute importantly 
to the difference in the frequency of 
use of the stuttering label between 
the samples from the speech of non- 
stutterers and those from the speech 
of stutterers, since the samples were 
about equally distributed between the 
two origins (See Table 3). 

On the basis of the results obtained, 
the assumption may be made that 




types of non-fluency have some effect 
on the labeling process. Also war- 
ranted is the assumption that non- 
fluencies consisting of syllable or 
sound repetition are labeled as stut- 
tering more frequently than are non- 
fluencies of other types, and non- 
fluencies consisting of revisions or in- 
terjections are labeled less frequently. 


Extensional Agreement Index. The 
criterion measures used to quantify 
extensional agreement were obtained 
by the following formula: 


EAI = (2X-n)/n 


where n represents the number of 
judges in a group and X is the num- 
ber within a group who consider a 
particular sample of non-fluency to 
be an example of stuttering. This 
measure has been presented and dis- 
cussed by Johnson (4). 
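
The formula can be illustrated with a short Python sketch. The function name is hypothetical. As written, the index runs from -1 (no judge in the group applies the label) through 0 (an even split) to +1 (every judge applies the label); the tabled means appear to be expressed on a scale of 0 to 100, which would correspond to 100 times the magnitude of the index, but that reading is an assumption and is not stated in the text.

```python
def extensional_agreement_index(x, n):
    """EAI = (2X - n) / n for X of n judges applying the stuttering label."""
    if n <= 0:
        raise ValueError("n must be a positive number of judges")
    if not 0 <= x <= n:
        raise ValueError("x must lie between 0 and n")
    return (2 * x - n) / n


# With 10 judges in a group:
for labeled in (0, 5, 8, 10):
    index = extensional_agreement_index(labeled, 10)
    print(f"{labeled} of 10 judges -> EAI = {index:+.1f}")
```

An index near zero thus marks a sample on which the group was evenly divided, regardless of how severe the non-fluency was rated.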

Three measures of extensional 
agreement were available for each 
sample, one for each group of judges. 
The data were evaluated by the same 
method used for the frequency of 
labeling. The analysis included the 
same three factors: (1) training of 
judges (Trained A, Trained B and Un- 
trained); (2) origin of non-fluencies 


TABLE 5. Mean extensional agreement indexes for application of the stuttering label for six sub-
groups of 100 non-fluencies each, judged by three groups of observers (1. Trained A, 2. Trained B
and 3. Untrained).

                                       Judges               General Means
Origin           Sub-groups       1       2       3    (Sub-groups)  (Origins)
Non-stutterers   Mild          76.8    56.6    76.6        70.0
                 Average       63.6    45.2    72.2        60.0         60.0
                 Severe        48.0    44.2    57.4        49.9
Stutterers       Mild          51.2    47.0    62.2        53.5
                 Average       64.4    75.2    52.0        63.9         70.7
                 Severe        97.8    94.0    92.4        94.7
(General Means)                67.0
















TABLE 6. Summary of analysis of variance for evaluation of extensional agreement index data.

Source of Variation      df           ss           ms       F*      p†
Between Samples         599   1177697.78
  Severity (S)            2     43832.45     21916.22     22.4    .001
  Origin (O)              1     51842.00     51842.00     52.9    .001
  SO                      2    326689.33    163344.66    166.8    .001
  error (b)             594    581596.66       979.12
Within Samples         1200    833866.89
  Judges (J)              2     22935.78     11467.87     19.3    .001
  JS                      4     14528.79      3632.19      6.1    .001
  JO                      2     40758.67     20379.33     34.2    .001
  JOS                     4     47960.33     11990.08     20.1    .001
  error (w)            1188    707683.32       595.68
Total                  1799   1899413.12

*F-ratios: msS/mserror(b); msO/mserror(b); msSO/mserror(b); msJ/mserror(w);
msJS/mserror(w); msJO/mserror(w); msJOS/mserror(w)
†p = point in the F-distribution



(from the speech of stutterers and 
from the speech of non-stutterers); 
and (3) severity of non-fluencies 
(mild, average and severe). Table 5 
presents the mean extensional agree- 
ment index for each of the 18 com- 
binations among the factor classifica- 





tions. A summary of the analysis is 
given in Table 6. All F-test results 
were highly significant. The relation- 
ships among the factors are apparently 
complex, and the significant interac- 
tions preclude any very meaningful 
interpretation of the results of the 
analysis. Examination of the means in 


Table 5, however, makes apparent that
judges agreed best on the severe non-
fluencies of stutterers and next best
on the mild non-fluencies of non-
stutterers. They agreed least on the
severe non-fluencies of non-stutterers,
with the next poorest agreement on
the mild non-fluencies of stutterers.
Figure 2 shows the average exten-
sional agreement index plotted against
the corresponding frequency of ap-
plication of the stuttering label for
each of the 18 combinations of groups

[Figure 2 plot. Ordinate: extensional agreement index. Abscissa: Frequency of Stuttering Label (0 to 100). Plotted symbols distinguish the three groups of judges.]

Figure 2. The trends of mean frequencies
of the use of the stuttering label as a func- 
tion of the extensional agreement index. 
Each plotted point represents the mean for 
100 speech samples. 


of speech samples with groups of 
judges. Examination of the figure 
makes readily apparent that the trends 
over groups of judges were highly 
similar for four groups of speech 
samples—the mild, average and severe 
non-stuttering samples and the mild 





140 JOURNAL OF SPEECH AND HEARING RESEARCH 


stuttering samples. In each instance 
the Untrained Judges applied the 
stuttering label least frequently and
had the highest agreement while the 
Trained B Judges applied the label 
the most frequently and had the low- 
est agreement. The trend is the re- 
verse for the average stuttering 
samples, the groups remaining in the 
same order with respect to frequency 
of application of the stuttering label 
but with higher agreement accom- 
panying increase of frequency. For 
the severe stuttering samples, as 
might be expected, both the agree- 
ment and the frequency of labeling 
are quite high for all three groups of 


judges and any trend appears negli- 
gible. 


Discussion 


The positive relationship between 
the frequency of the label ‘stutter- 
ing’ and the severity of the moments 
of non-fluency is not surprising. Dif- 
ferences among types of non-fluencies 
with equal severity, however, require 
explanation. At least one plausible ex- 
planation for the tendency to label 
sound and syllable repetitions as stut- 
tering is available. If a judge inten- 
sionally defines stuttering as non- 
fluencies which involve difficulty in 
saying words, and regards non-fluen- 
cies involving difficulty in express- 
ing ideas as normal, then it is plausible 
that he would associate sound and syl- 
lable repetitions with difficulty in 
saying a word rather than with dif- 
ficulty in formulating an idea, and 
so would be more likely to classify 
this type of non-fluencies as stutter- 
ing. On the other hand, revisions and 
interjections could plausibly be as- 
sociated with difficulty in expressing 
ideas and to the degree that this is 


true they would not be labeled as 
stuttering. 


It has been found that both adults 
and children who have been diag- 
nosed as stutterers present propor- 
tionately more sound and syllable 
repetitions than do non-stutterers (1,
2, 3). Also, sound and syllable repeti- 
tions, according to the results of the 
present study, are relatively likely to 
be labeled ‘stuttering.’ The impor- 
tance, however, of this labeling re- 
sponse in the original classification of 
individuals as stutterers is yet to be 
investigated. Three important ques- 
tions have yet to be answered: (1) 
Are sound and syllable repetitions 
more or less likely to be labeled stut- 
tering when they occur than when 
they do not occur in a context of 
meaningful connected speech? (2) 
Are perceived instances of non- 
fluency the chief determinant of the 
classification of a child as a stutterer? 
(3) Are there proportionately more 
sound and syllable repetitions in the 
speech of children classified as stut- 
terers at the time when they are 
originally so classified than in the 
speech of children of comparable age 
and development who are not clas- 
sified as stutterers? If all three of the 
above questions were to be answered 
in the affirmative, then the signifi- 
cance of sound and syllable repeti- 
tions in contributing to the classifica- 
tion of given speakers as stutterers 
would be established. 

The results of this study confirm 
an earlier report by Tuthill (7) that 
judges trained in speech pathology do 
more labeling of speech phenomena 
as stuttering than do untrained
judges. In the present study the 
judges knew when each example of 
non-fluency was to occur. Apparently 
the untrained judge, even when he is 
aware of a non-fluency, is less likely 










to label it as stuttering than the 
trained judge. Training in speech 
pathology, as represented by the
judges employed in this study, may 
thus not only increase awareness of 
non-fluencies, but it may also de- 
crease tolerance for such speech phe- 
nomena. 


Summary 


Three groups of judges, two groups 
trained in speech pathology at two 
different institutions, respectively, and 
one group with no such training, 
classified each of 600 short speech 
samples as containing a stuttering 
non-fluency or as containing a non- 
stuttering non-fluency. Half of the 
speech samples were from the speech 
of stutterers and half were from the 
speech of non-stutterers. The samples 
had also been rated for severity of 
non-fluency by another group of 
listeners. 

The frequency with which the 
judges applied the stuttering label 
varied with the rated severity of the 
samples. Trained judges applied the 
label more often than untrained 
judges. Sound and syllable repetitions 
were labeled as stuttering more often 
than revisions and interjections, re- 
gardless of rated severity. Judges, in 
general, agreed best on the severe 
stuttering non-fluencies and next best 


on the mild non-stuttering non- 
fluencies. The lowest agreement was 
on the severe non-stuttering non- 
fluencies and the next lowest on the 
mild stuttering non-fluencies. 


Acknowledgement 


Co-operation of Dr. Martin F. Pal- 
mer, Director of the Institute of 
Logopedics, is acknowledged grate- 
fully. 


References 


1. Cesaretti, M., A study of the non-
fluency of first grade children. M.A.
Thesis, Humboldt State College, 1958.

2. Johnson, W. (ed.), Analysis of recorded
speech samples, Chapter 8, The Onset
of Stuttering. Minneapolis: University
of Minnesota Press, (in press).

3. Johnson, W., Normative studies of
fluency. Unpublished research, Univer-
sity of Iowa.

4. Johnson, W., Studies in language be-
havior. I. A program of research.
Psychol. Monogr., 56, 1944, 1-15.

5. Lindquist, E. F., Design and Analysis
of Experiments in Psychology and Edu-
cation. Boston: Houghton Mifflin, 1953.

6. Thurstone, L. L. and Chave, E. J.,
The Measurement of Attitude. Chicago:
University of Chicago Press, 1929.

7. Tuthill, C. E., A quantitative study of
extensional meaning with special ref-
erence to stuttering. Speech Monogr.,
13, 1946, 81-98.





Two Voice-Message Storage Schemes 


Paul O. Thompson 
John C. Webster 
Roy G. Klumpp 
Walter F. Bertsch 


During periods of peak activity in a 
communication center such as a Com- 
bat Information Center (CIC) or Air 
Control Tower, an operator may be 
required to monitor several voice 
channels, and messages from these dif- 
ferent channels may overlap. The 
amount of information from voice 
messages which can be accurately 
processed by an operator is limited by 
the proficiency of the operator, the 





complexity of his response (10), the
speech intelligibility (4) and the message
density (11) of the messages to
which he must respond. Controlling
one or more of these factors may improve
operator output.


All authors are members of the Human
Factors Division of the U. S. Navy Electronics
Laboratory, San Diego. Paul O.
Thompson (M.A., University of Southern
California, 1950) and Roy G. Klumpp
(M.A., University of Wisconsin, 1951) are
Research Psychologists in the Auditory
Detection and Communications Section of
which John C. Webster (Ph.D., University
of Iowa, 1953), Supervisory Psychologist, is
Head. Walter F. Bertsch (M.A., Yale
University, 1957), Physicist, is a part-time
member of the section and Ph.D. candidate at
Yale. Some of the material in this article
was orally presented June 19, 1956, in a
session of the Fifty-first Meeting of the
Acoustical Society of America and the
Second Congress of the International
Commission on Acoustics at the Massachusetts
Institute of Technology. This experimentation
is also the subject of U. S. Navy
BUSHIPS Film Report No. 6-55, which
can be obtained from the U. S. Navy
Electronics Laboratory, San Diego 52,
California.


Operator proficiency may be im- 
proved through training. The com- 
plexity of the operator’s response may 
be reduced by using improved opera- 
tional procedures such as the use of 
standardized phraseologies (5). Speech 
intelligibility may be increased by in- 
creasing the signal to noise ratio, by 
widening the frequency response band 
and by reducing distortion in the 
system (4, 8). During periods of high 
message density (when messages over- 
lap) the operator’s output may be in- 
creased by the use of a separate loud- 
speaker for each channel to be 
monitored, by avoiding as much ir- 
relevant chatter as possible and by dif- 
ferentially filtering certain channels 
(2, 3, 6, 7). But even with these aids 
an operator cannot effectively re- 
spond to two overlapping messages 
(10). 

In a competing message situation it 
is possible that errors and missed mes- 
sages (and subsequent requests for 
repeats) would be kept at a minimum 










THOMPSON, WEBSTER, KLUMPP AND BERTSCH: VOICE STORAGE 143 


if the operator in some way were 
able to store a competing message 
momentarily, and then listen to it 
when he was ready, i.e., to self-pace 
his listening by control of message 
spacing. The earliest experiment of 
this type done at the Navy Elec- 
tronics Laboratory showed that the 
use of a self-pacing storage aid in a 
competing message situation increased 
the number of words correctly re- 
ceived by one and one-half times (9). 
A subsequent experiment (1) evalu- 
ated two storage schemes which pro- 
vided different degrees of control over 
the sequence of voice messages in a 
busy multi-channel situation. In this 
experiment the operator transcribed 
onto a plotting board complex in- 
formation from messages which fre- 
quently overlapped. He was permitted
to ask the sender for repeats
on missed messages. Full message- 
spacing control (complete self-pac- 
ing) was found to increase efficiency 
(more was transcribed correctly, few- 
er repeats were needed and delay time 
per repeat was reduced) while partial 
control was no better than no control. 

In an operational communications 
center, certain personnel not only 
have the task of receiving and/or 
transcribing information from com- 
peting voice messages but also are re- 
quired to consolidate the information 
and take appropriate action. For ex- 
ample, the CIC officer mentally tran- 
scribes and correlates information 
from a variety of sources, many of 
which are voice channels, and makes 
recommendations and decisions based 
on his consolidation of the informa- 
tion. 

In the previous experiment the com- 
parative benefit of message-spacing 
control in solving problems and/or 
making decisions was not tested. The 
present study, then, evaluates partial 


and full message-spacing control of 
competing voice messages in a task 
that not only included transcribing 
of message information but also solv- 
ing problems the information posed. 
The equipment and procedures were 
essentially the same as in the previous 
study (1) except for certain procedural
changes necessitated by the addition
of problem-solving.


Procedure 


The two message storage schemes 
were evaluated in a four-channel
communication net with two-way 
communication between an operator 
and each of four talkers, but with no 
direct communication among the 
talkers. The operator transcribed in- 
formation from the talkers’ messages 
onto a plotting board. Message density 
varied from a 20-second lapse between 











Figure 1. No-storage condition of the four-channel
two-way communication net. When
the message programmer caused a cueing 
light to be lighted, the cued talker (Moat, 
Rabbit, Stick, or Taxi) read a message to 
the operator (Whidbey). As shown, Moat 
would be reading a programmed message 
to Whidbey. Two-way communication al- 
lowed the operator to acknowledge mes- 
sages and to request repeats verbally from 
any talker. The operator could not mute 
interfering and/or irrelevant messages. 













[Figure 2 diagram. Legible labels: message programmer, store, cueing lights, operator (W).]


Figure 2. Fixed-storage condition. One
channel of the communication net is shown.
The message program drum first caused the
direct talker’s (Taxi_D) cueing light to be
lighted, and nine seconds later caused the
storage talker’s (Taxi_S) cueing light to be
lighted. The operator could switch at will 
between the ‘Direct’ and ‘Store’ circuits in 
order to get mechanical repeats of messages 
or to mute unwanted messages. Two-way 
communication, which permitted the op- 
erator to acknowledge messages and re- 
quest repeats verbally, was provided only 
between the operator and the direct talker. 
The same programs of message time se- 
quence were used for all experimental con- 
ditions (Figures 1, 2, 3). 


messages to the presentation of three 
messages simultaneously. As information
from the messages accumulated
on the plotting board, the operator 
was able to see interrelationships 
which allowed him to solve problems 
by consolidating the scattered infor- 
mation. Problems were solved under 
three conditions: (1) no-storage, in 
which the operator had no control 


over time of listening to received 
messages; (2) fixed-storage (partial 
control), which allowed the operator 
to store any message for a fixed 
amount of time (nine seconds); and 
(3) readout-on-demand-storage (full 
control), in which every message au- 
tomatically was stored until the op- 
erator could attend to it. In both 











[Figure 3 diagram. Legible labels: message programmer, stepping switch (CCW), message-indicating lights, push (to listen), to operator, cueing, talker.]


Figure 3. The readout-on-demand-storage
condition. One channel of the communica- 
tion net is shown. The programmer caused 
the stepping switch to move counterclock- 
wise (CCW), thereby connecting voltage 
to message-indicating lights in front of the 
operator. To hear a message the operator 
pressed the ‘Push to Listen’ button which 
(1) turned off the indicating light by back- 
stepping the stepping switch (clockwise), 
and (2) caused a cueing light to signal the 
talker to read a message. The operator 
could mute messages and verbally direct re- 
peat requests to any talker when necessary. 










message storage conditions the storage 
was provided by simulation tech- 
niques. 

Figure 1 shows the four-channel 
communication net in the no-storage 
(control) condition. An automatic 
message programmer controlled cue- 
ing lights which indicated to four 
talkers (M, R, S, T) when they were 
to read messages over wire links to 
the operator (W). The same message 
programmer and programs were used 
for all conditions. 






Figure 4. Quadrant (Position) III of the
operator’s four-quadrant plotting board
showing the necessary partial information
and complete solution for two objects. In
this problem the operator plotted in the
margins (1) row and direction information
from Moat, (2) column and color information
from Stick, (3) shape and direction
information from Taxi, and (4)
shape and color information from Rabbit.
Interlocking information indicates that the
Circle which is Red and pointed North
should be in cell G-6, while the East-directed,
yellow Plus should be in cell B-6.
A single quadrant might have from zero to
four objects and each object could have
any of four shapes, colors, or directions,
and be in any one of the 400 cells (100
cells, four quadrants). The total number of
objects ranged from six to eight over the
six problems.


Figure 2 shows one of the four 
channels of the communication net 
used to simulate the fixed-storage 
condition. This condition is different 
from the no-storage condition in that 
a second (storage) talker (T_S) repeated
each message nine seconds after
its first reading. T_S is a simulation of
a storage device, so repeat requests
were directed to T_D, the original
talker. The operator could switch 
any channel (M, R, S, T) to its 
‘store’ circuit if he chose to listen to 
the second reading of a particular 
message. 
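
The timing of the fixed-storage simulation can be sketched in a few lines. This is an illustration only: the function names and the example start times are hypothetical, and the 4.5-second duration is the average speaking time reported later in the article. The sketch shows that the stored copy of a message always occupies a fixed window nine seconds after the direct reading, whether or not that window is free of other traffic.

```python
FIXED_DELAY_S = 9.0  # delay between the direct reading and the stored reading


def stored_copy_interval(start_s, duration_s, delay_s=FIXED_DELAY_S):
    """Return (start, end) in seconds of the storage talker's reading."""
    return (start_s + delay_s, start_s + delay_s + duration_s)


def overlaps(a, b):
    """True when two (start, end) intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]


# Hypothetical example: a 4.5-second message read directly at t = 0.
stored = stored_copy_interval(0.0, 4.5)      # window of the stored copy
later_direct = (10.0, 14.5)                  # a later direct message (assumed)
collides = overlaps(stored, later_direct)    # stored copy competes with it
```

Because the delay is fixed, the operator cannot move the stored copy away from a later message that happens to fall in the same window, which is one way of reading the term 'partial control'.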


Figure 3 shows one channel of 
the readout-on-demand-storage simu- 
lation. The programmer activated a 
stepping switch which turned on mes- 
sage-indicating lights on the operator’s 
console. He could listen to the mes- 
sages by pressing a ‘push to listen’ but- 
ton which backstepped the stepping 
switch, turned off one message- 
indicating light and turned on the 
cueing light which signalled T to de- 
liver a message. 
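
A minimal sketch of the readout-on-demand logic for one channel, written as a counter that the programmer increments and the operator decrements. The class and method names are illustrative assumptions; the equipment described here was a stepping switch with indicating lights and a human talker, not software.

```python
from collections import deque


class ReadoutOnDemandChannel:
    """Toy model of one channel: pending messages are shown as lit lights."""

    def __init__(self, messages):
        self._unread = deque(messages)  # messages the talker has not yet read
        self.pending = 0                # number of indicating lights lit

    def program_step(self):
        """Programmer steps the switch forward: one more light is lit."""
        if self.pending < len(self._unread):
            self.pending += 1

    def push_to_listen(self):
        """Operator presses the button: backstep the switch, cue the talker."""
        if self.pending == 0:
            return None                 # nothing is waiting on this channel
        self.pending -= 1
        return self._unread.popleft()   # the talker reads the next message


channel = ReadoutOnDemandChannel(["first message", "second message"])
channel.program_step()
channel.program_step()
heard = channel.push_to_listen()
```

Messages are released in their programmed order, and only when the operator asks for them, which is the sense in which this condition gives full control over message spacing.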


The messages contained descriptive 
and positional information about spe- 
cific objects. This information was 
transcribed by the operator onto a 
four-quadrant plotting board, one 
quadrant of which is shown in Figure 
4. Each message contained an address, 
a station identification, the quadrant 
(‘Position’) involved, and two other 
pieces of information. A typical mes- 
sage read, ‘Whidbey, this is Moat; 
Position III, Row Baker, East, Over.’ 
The operator acknowledged by say- 
ing, ‘Whidbey, Roger, Out,’ and tran- 
scribed this message by plotting an 
arrow pointing East in Row B of 
Quadrant III. The operator received 
one message from each talker con- 
cerning each object, but he did not 
receive these messages in routine 
order. Messages about a particular 







object were dispersed in time among 
messages concerning one or more 
other objects. When a message con- 
tained no row or column designation, 
it was noted in the corner of the 
quadrant to which it pertained. One 
such message was, ‘Whidbey, this is 
Rabbit; Position III, Plus, Yellow, 
Over.’ 
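
The message format described above (address, station identification, quadrant and two further items) is regular enough to be parsed mechanically. The sketch below is only an illustration of that format; the function name, the dictionary keys and the example 'dud' station are assumptions, not part of the original procedure.

```python
RELEVANT_ADDRESS = "Whidbey"
RELEVANT_STATIONS = {"Moat", "Rabbit", "Stick", "Taxi"}


def parse_message(text):
    """Split a message transcript into address, station, position and items."""
    header, body = text.strip().rstrip(".").split(";", 1)
    address, station = [part.strip() for part in header.split(", this is ", 1)]
    fields = [field.strip() for field in body.split(",")]
    if fields and fields[-1].lower() == "over":
        fields.pop()                      # drop the closing procedural word
    position = fields[0].split()[-1]      # "Position III" -> "III"
    return {
        "address": address,
        "station": station,
        "position": position,
        "items": fields[1:],
        # A message is acted upon only if both identifications are relevant.
        "relevant": address == RELEVANT_ADDRESS and station in RELEVANT_STATIONS,
    }


moat = parse_message("Whidbey, this is Moat; Position III, Row Baker, East, Over.")
rabbit = parse_message("Whidbey, this is Rabbit; Position III, Plus, Yellow, Over.")
# Hypothetical 'dud': the station name "Mole" is invented for illustration.
dud = parse_message("Whidbey, this is Mole; Position II, Plus, Red, Over.")
```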


The operator acted only upon mes- 
sages addressed to Whidbey and iden- 
tified by the talkers as from ‘Moat,’ 
‘Rabbit,’ ‘Stick,’ or ‘Taxi.’ Some of 
the programmed messages were ‘duds’ 
which could be recognized as such 
only by an irrelevant address and/or 
station identification. The number of 
relevant messages per problem aver- 
aged 30 as compared to an average of 
five and one-half ‘duds.’ 


From the interlocking information 
in the messages transcribed to the mar- 
gins of the quadrant shown in Figure 
4, Whidbey (the operator) can con- 
clude there are two objects, both of 
which are in Column 6; in Row B, 
an East-directed Plus which is Yel- 
low and in Row G, a North-directed 
Circle which is Red. The operator 
knew in advance that an object could 
have one of each of four shapes, four 
directions and four colors, and could 
be located in any of 400 cells. If the 
restrictions placed on these alternatives 
by the coding of the interlocking 
messages are disregarded, the operator 
is calculated to have made a choice
among 25,600 alternatives (4 shapes x 4
directions x 4 colors x 400 cells), or a 14.6
bit decision (14.6 = log_2 25,600) in
order to plot an object correctly. On 
the average there were seven and 
one-half objects per problem. 
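
The consolidation of the Figure 4 example and the 14.6-bit figure can both be reproduced with a short sketch. The marginal entries below are the ones implied by the Figure 4 solution; the function name and data layout are illustrative assumptions, and the matching step assumes, for simplicity, that no two objects in the quadrant share a direction, a color or a shape.

```python
import math

# Marginal information for Quadrant III, as implied by the Figure 4 solution.
moat = [("G", "North"), ("B", "East")]            # (row, direction)
stick = [("6", "Red"), ("6", "Yellow")]           # (column, color)
taxi = [("Circle", "North"), ("Plus", "East")]    # (shape, direction)
rabbit = [("Circle", "Red"), ("Plus", "Yellow")]  # (shape, color)


def consolidate(moat, stick, taxi, rabbit):
    """Join the four talkers' partial descriptions into complete objects."""
    row_by_direction = {direction: row for row, direction in moat}
    column_by_color = {color: column for column, color in stick}
    color_by_shape = {shape: color for shape, color in rabbit}
    objects = []
    for shape, direction in taxi:
        color = color_by_shape[shape]
        cell = row_by_direction[direction] + "-" + column_by_color[color]
        objects.append({"shape": shape, "color": color,
                        "direction": direction, "cell": cell})
    return objects


objects = consolidate(moat, stick, taxi, rabbit)

# Alternatives per object when the interlocking restrictions are disregarded.
alternatives = 4 * 4 * 4 * 400       # shapes x directions x colors x cells
bits = math.log2(alternatives)       # about 14.6 bits
```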


The operators were instructed to 
solve each problem as quickly as pos- 
sible. They were informed of the 
basic performance criteria. These 
criteria were (1) the degree to which 
the problem was completed correctly 


by the end of the last programmed 
message and (2) how much additional 
time was required to complete the 
problem solution. Performance was 
also analyzed in terms of the number 
of messages repeated due to the op- 
erator’s verbal requests and the delay 
time for each such repeat. The opera- 
tors were permitted to ask for mes- 
sage repeats at any time. A typical re- 
peat request during the running of a 
message program was, ‘Moat, this is 
Whidbey; say again; Over.’ Moat’s 
response was a single repeated (re- 
read) message. Data on repeats were 
obtained by tape-recording the test 
sessions. These recordings also aided in 
analyzing the use of the fixed-storage 
provision and the delay in storage 
under the readout-on-demand con- 
dition. 


At the end of the program time for 
the last message the operator was told 
to step aside for a moment and a 
photograph of the board was taken 
for later analysis of the status of mes- 
sage transcription and object consol- 
idation at this point. After this inter- 
ruption of about five seconds the op- 
erator continued to work until his 
solution was complete and correct. 
During this time he corrected omis- 
sions and inaccuracies and consoli- 
dated objects by requesting repeats 
such as, ‘Taxi, this is Whidbey; repeat
all messages on Position IV;
Over.’ Taxi then reread his messages 
about this quadrant in the original 
programmed order until he had re- 
peated each one or was stopped by 
the operator. Such a request resulted 
in repetition of many messages which 
had already been transcribed in ad- 
dition to the one or more that the 
operator was seeking, but it was the 
most expeditious method of getting 
the missing information. 

The average speaking time per mes- 
sage was four and one-half seconds, so 






















that, if no message overlap had oc- 
curred, two minutes and 40 seconds 
of the average five minutes per prob- 
lem would have been utilized by the 
programmed messages. Since many of 
the messages overlapped, the actual 
speaking time (exclusive of repeats 
and acknowledgements) was two 
minutes and 24 seconds for the prob- 
lem with the least amount of overlap, 
and two minutes and four seconds for 
the problem with the greatest amount 
of overlap. Problem difficulty varied 
according to the number of objects 
about which the operator received 
messages, the over-all message density 
and the amount of competition (over- 
lap) between messages. 
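
The two minutes and 40 seconds quoted above is consistent with counting all programmed messages, that is, the average of 30 relevant messages plus five and one-half 'duds' per problem. The sketch below checks that arithmetic; the inclusion of the 'duds' is an inference from the reported figures, and the names are illustrative.

```python
AVG_SPEAKING_TIME_S = 4.5     # average speaking time per message
RELEVANT_PER_PROBLEM = 30     # average relevant messages per problem
DUDS_PER_PROBLEM = 5.5        # average 'duds' per problem


def no_overlap_speaking_time(n_messages, seconds_each=AVG_SPEAKING_TIME_S):
    """Total speaking time in seconds if no two messages overlapped."""
    return n_messages * seconds_each


total_s = no_overlap_speaking_time(RELEVANT_PER_PROBLEM + DUDS_PER_PROBLEM)
minutes, seconds = divmod(round(total_s), 60)   # whole minutes and seconds
```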


Six problems (180 relevant mes- 
sages specifying 45 objects) were 
solved by each of eight operators 
under each of the three experimental 
conditions (no-storage, fixed-storage, 
readout-on-demand-storage). Three 
quasi-equivalent sets of the six prob- 
lems were generated so that the op- 
erators would not realize they were 
solving the same basic problems under 
each of the three conditions.¹ During
a particular test session an operator 
solved problems under only one ex- 
perimental condition. The six per- 
mutations of order for the three ex- 
perimental conditions were distributed 
among the eight operators. The two 
permutations that were used twice 
were (1) readout-on-demand-storage, 
no-storage, fixed-storage and (2) no- 
storage, readout-on-demand-storage, 
fixed-storage. All of the operators 


¹Equivalent forms of the original six
problems were generated by changing ori- 
entation of the object patterns and the 
colors, shapes and directions of the objects. 
Homologous messages were presented in 
the same time sequence for all forms of 
the same test. 


were experienced in listening and plot- 
ting and were considered excellent 
plotters by the Navy. Each operator 
was given a four-hour training period 
to acquaint him with the experimental 
conditions. 
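
The counterbalancing scheme can be written out as follows. Three conditions give six possible orders, and two of them were used a second time to cover eight operators. Which operator received which order is not reported, so the list below is only an enumeration, not the actual assignment.

```python
from itertools import permutations

CONDITIONS = ("no-storage", "fixed-storage", "readout-on-demand-storage")

# All six possible orders of the three experimental conditions.
all_orders = list(permutations(CONDITIONS))

# The two orders reported as having been used twice.
repeated = [
    ("readout-on-demand-storage", "no-storage", "fixed-storage"),
    ("no-storage", "readout-on-demand-storage", "fixed-storage"),
]

# Eight condition orders, one per operator (pairing with operators unknown).
assignment = all_orders + repeated
```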


To summarize, each experimental 
condition was evaluated on four cri- 
teria: (1) the number of errors (mes- 
sages from which information was in- 
correctly transcribed or omitted) be- 
fore the scheduled end of the last pro- 
grammed message; (2) the number of 
repeats due to verbal requests by the 
operator (a) during and (b) after the 
last programmed message; (3) the 
number of object consolidations not 
completed by the end of the message 
program; and (4) the additional time 
required for total completion of the 
object consolidation. The number of 
stored messages and associated delays 
were also evaluated for the two stor- 
age conditions. 


Results 


Table 1 shows the results of the 
operators’ performance up to the end 
of the last programmed message in
terms of (1) transcription errors, (2) 
repeats and associated delays, (3) 
stored messages and associated delays 
and (4) objects unfinished. Percent- 
ages in the first three column headings 
are based on 1440 messages (8 opera- 
tors x 6 problems x 30 messages per 
problem) and in the last column for 
360 objects (8 x 6 x 7.5). In paren- 
theses are the results in percentages 
from the earlier experiment (1) where
a different group of operators used 
three of the six problems used here 
and their task was one of transcrip- 
tion, object consolidation not being 
required. 
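
The percentage bases can be checked directly. The sketch below recomputes the transcription-error percentages from the error counts given in Table 1, together with the average number of missed messages per operator per problem under readout-on-demand; the variable names are illustrative.

```python
OPERATORS, PROBLEMS = 8, 6
MESSAGES_PER_PROBLEM = 30     # average relevant messages per problem
OBJECTS_PER_PROBLEM = 7.5     # average objects per problem

total_messages = OPERATORS * PROBLEMS * MESSAGES_PER_PROBLEM
total_objects = round(OPERATORS * PROBLEMS * OBJECTS_PER_PROBLEM)


def per_cent(count, base):
    """Percentage of `base` represented by `count`, to one decimal place."""
    return round(100 * count / base, 1)


# Transcription-error counts (N) from Table 1.
transcription_errors = {
    "readout-on-demand": 76,
    "fixed-storage": 248,
    "no-storage": 187,
}
error_per_cent = {
    condition: per_cent(n, total_messages)
    for condition, n in transcription_errors.items()
}

# Missed messages per operator per problem under readout-on-demand.
missed_per_operator_problem = round(76 / (OPERATORS * PROBLEMS), 1)
```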

In both experiments the errors 







Table 1. Results in terms of data recorded during and at the end of the message program, compared
with similar data (in parentheses) from the previous experiment in which operators were required
only to transcribe message information (1). Average delay times are given in seconds.


                     Transcription      Repeated Messages           Stored Messages             Objects
                     Errors                                                                     Unfinished
                     N     Per Cent     N      Per Cent  Delay      N      Per Cent  Delay      Per Cent

Readout-on-Demand    76    5.3          10     0.7       9.0        1440   100       9          53.1
Storage                    (0.8)               (3.3)     (8.7)             (100)

Fixed Storage        248   17.2         2[?]   1.7       14.5       167    11.6      9          63.3
                           (16.6)              (6.7)     (14.8)            (6.9)     (9)

No Storage           187   13.0         195    13.5      14.2                                   59.7
                           (15.4)              (14.7)    (14.4)








(messages not transcribed or incor- 
rectly transcribed) are considerably 
fewer in the readout-on-demand con- 
dition than in the other two con- 
ditions. The differences between ex- 
periments are slight except for the 
readout-on-demand condition, where 
the difference is due mostly to a 
procedural change, an artifact of 
when the errors were counted. In the 
previous experiment, errors were 
counted after all messages had been 
heard; in this experiment they were 
counted after the last message was 
programmed. Since the operator was 
consolidating objects as he went 
along, he occasionally would have 
from one to three messages in stor- 
age at the time errors (which included 
omissions) were recorded. By di- 
viding the 76 missed messages for 
the readout-on-demand condition by 
eight operators and six problems, an 
average of one and six tenths messages 
per operator per problem is obtained. 
The large majority of these messages 
were not transcribed because they 
had not been heard as yet. If the 
error score is computed 15 seconds 
after the last programmed message, 
these stored messages have been read 
out and the error score drops to less 
than one per cent. The addition of the
problem-solving (object-consolidation)
task increases the response complexity
of the operator, and consequently an
increase in errors would be
expected (10). However, the object-
consolidation task also made errors es- 
pecially costly since the operator 
could not terminate this far from 
pleasant task until he had correctly 
consolidated all objects, which he 
knew he could not do until all mes- 
sages were correctly transcribed, 
whereas in the previous experiments 
transcription errors did not lengthen 
the experimental session. Apparently 
this completion requirement kept the 
operator at peak performance and 
offset the increased error tendency 
previously noted (10) for more complex
responses.

Table 1 also shows some interest- 
ing results on the number of repeated 
messages. First, both readout-on-de- 
mand and fixed-storage considerably 
reduced the percentages of repeats re- 
quested, especially in this experiment 
(compare 0.7 and 1.7 with 13.5). This 
reduction is important in a practical 
sense because repeated messages re- 
duce available channel time. Second, 
in considering both the repeated and 
stored messages, note that 13.3 (13.6) 










[Figure 5 bar chart. Legend: time to finish problems (minutes); objects unfinished (per cent); transcription error (per cent); repeats. Conditions: readout-on-demand, fixed storage, no storage. Time-to-finish labels legible in the chart: 4.07 min., 4.0 min., 1.99 min.]








Figure 5. Added time to finish the problem
for each of the three experimental con- 
ditions, object consolidation unfinished at 
the end of program generation, transcription 
errors and repeats due to verbal requests. 
When the reference is to messages (as for 
errors or repeats), the percentages are based 
on a total of 1440; when the reference is to 
objects, the percentages are based on 360. 
The time-to-finish values represent the 
average time from the end of a message
program to the end of the problem solu- 
tion. For each condition the left vertical 
bar represents those repeats requested dur- 
ing (D) the message program, the middle 
bar those repeats coming after (A) the 
end of the program. The shaded sections 
of the middle bars represent extra repeats 
after the program end of messages already 
repeated once. The rightmost bar in each 
group represents the total number of re- 
peats, i.e., the sum of the first two bars. 


per cent of the messages were listened 
to a second time under the fixed-stor- 
age condition and 13.5 (14.7) per cent 


under no-storage. However, in the 
no-storage case all were verbally re- 
quested repeats, while in the fixed- 
storage case the majority were ma- 
chine stored, especially in the present 
(problem-solving) experiment (11.6). 
Third, the number of verbally requested
repeats was greater in the
former (transcription) experiment, at 
least for the two storage conditions. 
Apparently, if preoccupied with prob- 
lem solving, the operator has less 
tendency to ask for repeats as the 
messages are coming in. Another fac- 
tor involved is that in this experiment 
the operator could (and he knew he
could) ask for repeats after the programmed
messages stopped. (The consequences
of this will be discussed
later with reference to Figure 5.) In
the former experiment he had to ask 
for them immediately or not at all. 
Fourth, there is remarkably good 
agreement between the delay times²
associated with repeated messages in 
the two experiments. It is coincidental 
that the average delay for the read- 
out-on-demand stored messages turned 
out to be the same value as the fixed 
delay of the fixed-storage simulation. 
The percentages of objects that were 
not consolidated by the end of the 
last programmed message are given 
in the last column of Table 1. The 
differences among conditions on this 
measure are less than for errors or 
repeats. 

At the end of the programmed mes- 
sages in the problem-solving experi- 
ment the operators continued toward 
a complete solution by asking for re- 
peats of blocks of messages until they 
had all messages transcribed and all 


²The ‘delay’ was the time in seconds between
the first (programmed) time a message
came in and the time the same message
came in again because of either a verbal
request or machine storage.










Table 2. Mean square deviations and significance results from variance analysis of the four sets
of measurements.


Source            df    Transcription    Objects    Repeats      Time
                        ms               ms         ms           ms

O (operators)      7     33.01           13.79        91.00      25.22
P (problems)       5    215.82*          15.22*      189.84*     19.86*
C (conditions)     2    584.55†           8.85      1433.00†     70.52†
OP                35     12.04            2.29        26.37       4.20
OC                14     40.31            7.13        61.40      11.59
PC                10     35.51‡           2.30        29.12       2.57
OPC               70     14.14            1.86        29.49       4.02


*P/OP; significant at or beyond 1% level.
†C/OC; significant at or beyond 1% level.
‡PC/OPC; significant at or beyond 5% level.


objects plotted. The most important 
criterion of over-all performance was 
the time required by the operators to 
accomplish this. Figure 5 shows this 
added time required to complete the 
solution, the repeats required both 
during and after the message program, 
together with the transcription errors 
and objects yet to be plotted at the 
end of the programmed messages 
(taken from Table 1). Note that in 
the readout-on-demand condition 
where the transcription error score 
was on the order of one-half the error 
score of the other two conditions, the 
time-to-finish was one-half. The re- 
sults on repeats show that, as the 
transcription error increased a small 
amount, the number of repeats in- 
creased greatly. The bulk of the re- 
peats occurred after (A) the end of 
the programmed messages. The values 
for repeats requested during (D) the 
programmed messages were taken 
from Table 1. The shaded portion at 
the top of each A bar represents the 
number of messages that were repeated 
more than once after the program 
end. These extra repeats reflect op- 
erator confusion, since there was no 
reason that the operator should re- 


quire more than one repetition of any 
message or message group during this 
final period.³ The final bar is merely
the sum of the first two. Note that 
under the fixed-storage condition the 
operators asked for relatively few re- 
peats during the programmed mes- 
sages. Reference to Table 1 shows 
that under this condition they were 
using the storage provision for 11.6 
per cent of the messages. The fact that 
the error score and the number of 
repeats requested after the end of pro- 
grammed messages are highest for 
this condition implies that the stored 
messages were not efficiently handled. 
On the other hand, in the no-storage 
condition the operators apparently re- 
duced their error scores by request- 
ing more repeats during the message 
program. This tended to reduce the 
number of repeats needed after the 





³Some of the repeats represented by both
unshaded and shaded portions of the A bars 
may be of messages already repeated during 
the program (D bars). Most of this duplica- 
tion resulted from the fact that after the 
program end, message repeats had to be re- 
quested by quadrant and the talker answered 
with up to four messages, although the op- 
erator may have needed only one. 






















end of the message program, although 
the total number of repeats was great- 
est under this condition. 


It is interesting to note that the 
number of repeats coming after the 
program end is also considerably 
smaller under readout-on-demand 
storage than under the other con- 
ditions. 


Table 2 shows the analysis-of-vari- 
ance results of the four main sets of 
data in terms of mean square deviation 
and degree of significance. The sig- 
nificances of the differences among 
problems (P), the differences among 
conditions (C), and the PC inter- 
action were evaluated using the mean 
square ratios indicated in the foot- 
notes. The differences among operators
(O) and the operator interactions
were not evaluated since there was 
no valid error term with which they 
could be compared. Differences among 
problems and among conditions were 
significant at the one per cent con- 
fidence level in all cases except for 
conditions in the object-consolidation 
data. 
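
The mean square ratios named in the Table 2 footnotes can be recomputed from the tabled values. The sketch below does this for the transcription data only and checks the degrees of freedom against the 8 x 6 x 3 design; it reports the ratios and makes no attempt to look up critical values. The dictionary layout and function name are illustrative assumptions.

```python
# Mean squares for the transcription data, copied from Table 2.
transcription_ms = {
    "O": 33.01, "P": 215.82, "C": 584.55,
    "OP": 12.04, "OC": 40.31, "PC": 35.51, "OPC": 14.14,
}

# Degrees of freedom for each source in the operators x problems x conditions design.
degrees_of_freedom = {
    "O": 7, "P": 5, "C": 2, "OP": 35, "OC": 14, "PC": 10, "OPC": 70,
}


def mean_square_ratio(effect, error, ms=transcription_ms):
    """Ratio of an effect mean square to its error-term mean square."""
    return ms[effect] / ms[error]


ratio_problems = mean_square_ratio("P", "OP")       # problems tested against OP
ratio_conditions = mean_square_ratio("C", "OC")     # conditions tested against OC
ratio_interaction = mean_square_ratio("PC", "OPC")  # PC tested against OPC

total_df = sum(degrees_of_freedom.values())         # should equal 8 * 6 * 3 - 1
```

Each main effect is tested against its interaction with operators, which is the usual choice when operators are treated as a random sample.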


Although no valid test of signifi- 
cance can be made of the OC inter- 
action, it is quite obvious that the 
mean squares associated with this in- 
teraction are always large as com- 
pared with the triple interaction. This 
implies caution in generalizing for all 
operators over all conditions. This 
plus the fact that the interaction of 
problems with conditions (PC) was 
significant for transcription error sug- 
gested a need for further analysis by 
operator and problem subgroups. 
Consequently the results were re- 
analyzed according to the following 
categories: the four fastest operators 
and the four slowest operators; the 
three easiest problems and the three 
hardest problems. This breakdown 
was made on the basis of the no-stor- 


age condition average scores on the 
time-to-finish criterion, which was 


[Figure 6 bar charts. Conditions: readout-on-demand storage, fixed storage, no storage. Groupings: faster operators and slower operators, each on easy and hard problems. Panels: time to finish problems (minutes); repeats (per cent); objects unfinished (per cent); transcription error (per cent).]


Figure 6. Results of faster and slower op- 
erators on easy and difficult problems. Time 
to finish problems was measured from the 
end of the message programs. Repeats rep- 
resent the total repeats as given in Figure 
5. Objects unfinished and transcription 
error are measurements taken at the end of 
the message program. The stars and 
brackets indicate significant differences between
conditions. Two stars indicate that
the mean of the difference distribution for 
the conditions bracketed is significantly dif- 
ferent from zero beyond the one per cent 
level of confidence according to a standard 
table of t-ratios. A single star indicates sig- 
nificance beyond the five per cent level. 
The difference distribution in each case was 
based on 12 differences (four operators on 
three problems). 










considered to be the most crucial per- 
formance measure. The results of this 
analysis are shown in Figure 6. 
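
The significance markers in Figure 6 rest on a t-ratio computed from 12 paired differences (four operators on three problems). A minimal sketch of that computation follows. The difference scores are hypothetical, since the individual scores are not reported, and the two-tailed critical values for 11 degrees of freedom are taken from a standard t table on the assumption that two-tailed tests were intended.

```python
import math
import statistics

# Two-tailed critical values of t for 11 degrees of freedom (standard table).
T_CRITICAL_11_DF = {"five per cent": 2.201, "one per cent": 3.106}


def paired_t_ratio(differences):
    """Return (t, df) for testing whether the mean difference is zero."""
    n = len(differences)
    mean_difference = statistics.mean(differences)
    sd = statistics.stdev(differences)          # sample SD (n - 1 denominator)
    return mean_difference / (sd / math.sqrt(n)), n - 1


# Hypothetical time-to-finish differences in minutes between two conditions,
# one value per operator-problem pair (4 operators x 3 problems).
example_differences = [1.2, 0.8, 2.1, 1.5, 0.3, 1.9,
                       2.4, 0.9, 1.1, 1.7, 0.6, 1.4]

t_ratio, df = paired_t_ratio(example_differences)
two_stars = abs(t_ratio) > T_CRITICAL_11_DF["one per cent"]
```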


It is evident from the results on the 
time-to-finish criterion that for the 
faster operators, regardless of prob- 
lem difficulty, readout-on-demand- 
storage was no improvement over no- 
storage. However, for the slower operators
readout-on-demand-storage
was advantageous on both easy and 
hard problems as compared with 
either fixed-storage or no-storage. Al- 
though the scores under fixed-storage 
were better than under no-storage for 
these operators, the difference was 
not significant. 


The next most important measure 
operationally is the number of mes- 
sages repeated due to verbal request 
because repeats mean delay and extra 
circuit load. Repeats are shown in 
terms of percentages which were ob- 
tained by dividing the total number 
of repeats, both during and after the 
programmed messages, by the number 
of relevant programmed messages. For 
easy as well as hard problems both 
groups of operators gained by using 
readout-on-demand-storage as com- 
pared with either no-storage or fixed- 
storage. For the faster operators fixed- 
storage seemed to be a hindrance as 
compared with no-storage although 
the difference between conditions was 
not significant. For the slower opera- 
tors fixed-storage was an improve- 
ment over no-storage. 
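The repeat percentage defined above is a simple ratio. A minimal sketch follows; the function name and the sample counts are hypothetical and are not taken from the study's data.

```python
def repeat_percentage(repeats_during, repeats_after, relevant_messages):
    """Total repeats (during and after the program) as a per cent of
    the number of relevant programmed messages."""
    if relevant_messages <= 0:
        raise ValueError("relevant_messages must be positive")
    total_repeats = repeats_during + repeats_after
    return 100.0 * total_repeats / relevant_messages


# Hypothetical example: 3 repeats during the program, 2 afterwards,
# 40 relevant programmed messages.
print(repeat_percentage(3, 2, 40))
```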


Results on two interim criteria of 
performance are shown in the bot- 
tom half of Figure 6. The differences 
between conditions in object con- 
solidation were relatively small and 
only four comparisons were signifi- 
cant. On transcription error, readout- 
on-demand-storage was significantly 
better than the other conditions in all 
but one comparison. 


In an over-all view of Figure 6 it 
appears that no simple relationship 
exists between the time required to 
finish a problem and any of the other 
criteria of performance. 


Discussion 


Effectiveness of a storage device in 
simplifying a listening task can prob- 
ably be best explained in terms of the 
operator’s control over the time se- 
quence of listening to messages. In 
this experiment the no-storage con- 
dition gave the operator no control 
over the time sequence. The readout- 
on-demand-storage condition allowed 
complete control. Lying between 
these extremes of message-density con- 
trol, the fixed-storage condition gave 
partial control. Since complete con- 
trol improved performance appreci- 
ably in most cases, it might be ex- 
pected that partial control would im- 
prove performance at least to some 
extent. The fact that the partial con- 
trol in the fixed-storage condition did 
not improve performance is probably 
due to such factors as (1) lack of 
positive sequence control (that is, 
there is no assurance that a message 
delayed for a fixed interval will not 
still compete with messages from 
other channels); and (2) unavoidable 
message loss (that is, by switching 
a channel from ‘Store’ back to ‘Direct’ 
the operator loses whatever may have 
come into storage during the last nine 
seconds). To resolve the problem of
competition between a particular
stored message on one channel and 
messages on other channels, extra de- 
lay intervals could be provided so 
that the operator would have one or 
more extra chances to hear the stored 
message. The difficulty with this 
procedure is that now, when the op- 
erator switches back to ‘Direct,’ the 






















potential loss of involuntarily stored 
messages on that channel will have 
increased proportionately. 

In this experiment, then, the fixed- 
storage provision increased the opera- 
tor’s response complexity without 
satisfactorily resolving the competing- 
message problem. On the other hand, 
the readout-on-demand-storage pro- 
vision simplified the operator’s over- 
all task by completely eliminating 
message competition without substan- 
tially increasing response complexity. 
In response to questionnaires given 
after completion of all testing, all but 
one of the operators rated readout-on- 
demand-storage as the condition they 
most preferred primarily because it 
tended to reduce frustration, tension, 
and fatigue. All operators thought the 
fixed-storage condition was difficult. 
At least two operators mentioned 
extra tension built up by the nine- 
second wait for a stored message in 
the fixed-storage condition. The ma- 
jority felt that increased problem 
length and/or message density would 
only increase their preference for 
readout-on-demand-storage. 

As compared with no-storage, read- 
out-on-demand-storage did not de- 
crease the time required by the faster 
operators to finish the problems. Ap- 
parently for these operators the mes- 
sage density and/or the five-minute 
program length was not great enough 
to differentiate decisively between 
these two conditions. However, the 
task was sufficient to prove the de- 
ficiencies of the fixed-storage pro- 
vision for both groups of operators. 
Even the most proficient operators 
indicated in their questionnaire re- 
sponses that under the no-storage and 
fixed-storage conditions they did as 
well as they did only by great con- 
centration and effort and that inten- 
sification or extension of the task 


would have resulted in saturation and 
probable breakdown in performance. 


Summary 


Two message storage schemes 
(fixed-storage and readout-on-de-
mand-storage), providing an operator
with differing degrees of control over 
the time sequence of listening to pro- 
grammed voice messages, were evalu- 
ated in a four-channel communication 
net. A control condition (no-storage), 
which gave the operator no control 
over sequence of listening, was used 
for comparison. The operator was re- 
quired to solve a plotting board prob- 
lem using information contained in 
the messages. The messages for a par- 
ticular problem were programmed in 
advance to a schedule that, on the 
average, consumed five minutes. The 
message density varied from simul- 
taneity of three messages to inter- 
message gaps of 20 seconds. To suc- 
cessfully solve a problem an operator 
had to transcribe the message infor- 
mation onto the board and consolidate 
it. When essential information was 
missed, he obtained it by verbal re- 
quest for message repeats. 


As compared with the no-storage 
(control) condition, the readout-on- 
demand-storage condition, which al- 
lowed the operator complete time-of- 
listening control, substantially cut 
down the number of errors (by about 
one-half), the number of repeats re- 
quested (by about three-fourths), and 
the time required to correctly solve 
the problem (by one-half). Perform- 
ance under the fixed-storage con- 
dition was essentially the same as 
under the no-storage condition. Anal- 
ysis of the results by faster and slower 
operator groups indicated that the 
slower operators performed signifi- 







cantly better under readout-on-de- 
mand-storage, while for the faster op- 
erators this condition resulted in sig- 
nificantly improved performance in 
terms of repeats and transcription er- 
rors, but not in the time required to 
finish the problems. 


References 


1. Bertsch, W. F., Webster, J. C., Klumpp, R. G., and Thompson, P. O.,
Effects of two message-storage schemes upon communications within a small
problem-solving group. J. acoust. Soc. Amer., 28, 1956, 550-553.

2. Broadbent, D. E., The role of auditory localization in attention.
Gt. Brit. Royal Naval Comm. Rep., RNP 52/718, 1952.

3. Egan, J. P., Carterette, E. C., and Thwing, E. J., Some factors affecting
multi-channel listening. J. acoust. Soc. Amer., 26, 1954, 774-782.

4. Miller, G. A., Language and Communication. New York: McGraw-Hill
Book Co., Inc., 1951.

5. Pollack, I., and Tecce, J., Standardized communications and message
reception. J. acoust. Soc. Amer., 30, 1958, 62-64.

6. Spieth, W., Curtis, J. F., and Webster, J. C., Responding to one of two
simultaneous messages. J. acoust. Soc. Amer., 26, 1954, 391-396.

7. Spieth, W., and Webster, J. C., Listening to differentially filtered
competing voice messages. J. acoust. Soc. Amer., 27, 1955, 866-871.

8. Stevens, S. S., Handbook of Experimental Psychology. New York: John
Wiley and Sons, Inc., 1951.

9. Webster, J. C., and Sharpe, L., Improvements in message reception
resulting from ‘Sequencing’ competing messages. J. acoust. Soc. Amer., 27,
1955, 1194-1198.

10. Webster, J. C., and Solomon, L. N., Effects of response complexity upon
listening to competing messages. J. acoust. Soc. Amer., 27, 1955, 1199-1203.

11. Webster, J. C., and Thompson, P. O., Factors affecting speech
intelligibility in aircraft control towers. U. S. Navy Electron. Lab. Rep.,
357, March 1953.



















Some Variables Affecting Perceived Harshness 


Maryjane Rees 


Harshness is frequently listed as one 
of the major classifications of voice 
quality defects even though much 
disagreement exists among such classi- 
fications. Ambiguity of terminology 
has been a major contributor to this 
lack of agreement. Comprehensive 
discussions of voice defects, however, 
either include the category harshness 
or employ descriptive terms which 
are usually considered to be synony- 
mous with harshness such as strident, 
raucous, rough, or rasping. Little re- 
search has been conducted on harsh- 
ness, and interpretation of the data 
that are available is confused by un- 
certainty with regard to precisely 
what vocal characteristics are being 
considered. Only one study (22) deal- 
ing unequivocally with harshness 
was found. A second study (2) con- 
cerned with simulated harshness was 
available. 

Whether the interest be in the the- 
oretical aspects of voice production 
or in application of knowledge to 
clinical procedures, it is evident that 
considerably more information is 
needed than is now available. One 
approach to a better understanding 





Maryjane Rees (Ph.D., State University 
of Iowa, 1954) is Assistant Professor of
Speech in charge of the Speech and Hear- 
ing Center at Sacramento State College. 
This article is based on a doctoral dis- 
sertation completed under the direction of 
Professor Dorothy Sherman. 


of the nature and characteristics of 
harshness would be the identification 
of conditions under which this voice 
quality varies. As a point of departure, 
conditions which produce perceptu-
ally evident variability of this voice
quality would indicate significant 
areas for further investigation. 

To be useful, the sine qua non of 
such an investigation is the identifica- 
tion of the voice quality under con- 
sideration in such a manner that sub- 
sequent researchers could continue to 
add to the present information with 
reasonable confidence that their data 
pertain to the same entity. 

For the purpose of this investiga-
tion, then, harsh voice quality is de-
fined as that voice quality agreed
upon by qualified listeners when in- 
structed to identify harshness accord- 
ing to the description proposed by 
Curtis (5), ‘Harsh voice quality has an
unpleasant, rough, rasping sound. It 
is often heard in people for whom 
voice production seems to be a con- 
siderable effort or strain.’ Sherman 
and Linke (22) have shown that the 
voice quality identified by employing 
the above description can be reliably 
scaled for short samples of speech. 

Several considerations dictated the 
selection of variables for investigation 
in this study. First, it had already been 
demonstrated by Sherman and Linke 
(22) that degree of harsh voice quality 
varies as a function of vowel types. 
They had speakers with harsh voices 




read passages that differed with re- 
spect to vowel content. The vowels 
had been dichotomized in three ways: 
(1) tense and lax, (2) front and back 
and (3) high and low. Each vowel 
appeared in more than one classifica- 
tion and the contribution of individual 
vowels to differences among passages 
with respect to severity of harshness 
could not be determined. On the basis 
of their obtained results, it seemed 
evident that differences among vowels 
of any one classification must exist. 
Second, secondary acoustical charac- 
teristics of vowels are affected by con- 
sonant environments (10). Voicing, 
manner of production, and place of 
articulation of consonants adjacent to 
vowels cause frequency, duration and 
intensity changes in the vowels. Such 
changes might be associated with ex- 
acerbation or diminution of harshness. 
Third, some observers (3, 7) have as- 
sociated abruptly initiated vowels with 
harsh voice quality. One author (7) 
has recommended gradual initiation of 
vowels by releasing them with the 
consonant [h] as corrective drill for 
harshness. It was hypothesized, there- 
fore, that isolated vowels would differ 
in harshness from vowels initiated 
more gradually. 


In accordance with the above- 
mentioned considerations, materials 
were chosen to allow study of the 
influence of vowels, the influence of 
some selected consonant environments 
and the influence of type of vowel 
initiation on perceived harsh voice 
quality. The following questions were 
proposed: (1) In what order do 


vowels influence the degree of per- 
ceived harsh voice quality? (2) To 
what known parameter of the vowel 
might this order be related? (3) Is 
degree of perceived harshness of the 
vowel related to its environment? 
(4) Do voiced and voiceless conso- 


nant environments cause differences in 
degree of perceived harshness? (5) 
Do stop-plosive and fricative conso- 
nant environments cause differences in 
degree of perceived harshness? (6) Is 
abruptness of vowel initiation related 
to severity of harshness? 


Procedure 


Selection of Material. In order to 
make comparisons among individual 
vowels and to study the influence of 
consonant environments and type of 
vowel initiation, simple contexts which 
would not be affected by pitch, loud- 
ness and stress variations operating in 
connected speech were necessary. 
Therefore, isolated vowels and sylla- 
bles of the consonant-vowel and con- 
sonant-vowel-consonant type were 
chosen. 

Only vowels used in General Amer- 
ican dialect were selected. Diphthongs 
and the vowels [e] and [o] which
are often diphthongized were ex- 
cluded. Consonant environments were 
chosen which were known to influ- 
ence physical characteristics of vowels 
produced by speakers with normal 
voices. House and Fairbanks (10) 
found that the greatest differences in 
frequency, duration and intensity
characteristics obtained when vowels 
in voiced and voiceless environments 
were compared. Differences also ob- 
tained between vowels in stop-plosive 
and fricative environments. Individual 
consonants within these classifications 
were selected on the basis of avail- 
ability of information concerning their 
influence on vowels. The consonant
[h] was also chosen so that the vowel 
thus initiated could be compared with 
the vowel initiated without a releasing 
consonant. 


The final selection of material was 
as follows: 













I. Vowels: [i], [ɪ], [ɛ], [æ], [ʌ], [ɑ], [ɔ], [ʊ], and [u]

II. Consonant environments:
A. Voiced
1. Stop-plosives [d], [g]
2. Fricatives [v], [z]
B. Voiceless
1. Stop-plosives [t], [k]
2. Fricatives [f], [s]

III. Initiation environments:
A. [h] or gradual
B. No releasing consonant, that is, the isolated vowel.


Each vowel was combined with 
each consonant except [h] to form 
a consonant-vowel-consonant syllable 
with the same consonant in initial and 
final positions resulting in 72 sylla- 
bles. Each vowel was combined with 
the consonant [h] to form a conson- 
ant-vowel syllable resulting in an ad- 
ditional nine syllables. Each of the 
nine vowels was also included. The 
final syllable list thus contained 90 
items. 
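The construction of the 90-item list can be checked with a short enumeration. This is a minimal sketch: the vowel symbols follow the reading of the material list adopted here, and the function name `build_syllable_list` is hypothetical.

```python
# Nine vowels and eight consonant environments, as read from the material list.
VOWELS = ["i", "ɪ", "ɛ", "æ", "ʌ", "ɑ", "ɔ", "ʊ", "u"]
CONSONANTS = ["d", "g", "v", "z", "t", "k", "f", "s"]


def build_syllable_list(vowels=VOWELS, consonants=CONSONANTS):
    """Return the CVC syllables, the [h]-released syllables, and the isolated vowels."""
    # Same consonant in initial and final position: 9 vowels x 8 consonants = 72.
    cvc = [c + v + c for v in vowels for c in consonants]
    # Consonant-vowel syllables released by [h]: 9 items.
    h_released = ["h" + v for v in vowels]
    # Isolated vowels, that is, no releasing consonant: 9 items.
    isolated = list(vowels)
    return cvc + h_released + isolated


syllables = build_syllable_list()
print(len(syllables))
```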


Selection of Speakers. Twenty-four 
adult males with clinically diagnosed 
harsh voice quality recorded samples 
of connected speech. From these re- 
cordings, four experienced listeners 
selected only those whose voices were 
harsh and without other quality devia-
tions. The listeners unanimously
agreed that 12 of the voices met these 
criteria. All speakers spoke General 
American dialect and none had articu- 
lation errors. 


Recording Procedure. The 90-item 
syllable list was independently ran- 
domized for each speaker by a table 
of random numbers. Prior to record- 
ing, each speaker repeated the sylla- 
bles after the experimenter. They were 
instructed to produce each syllable 
with a level inflection and to maintain
even stress from syllable to syllable.
After all syllables were judged to be
satisfactory, they were recorded in 
the above manner. Three experienced 
listeners checked each recorded sylla- 
ble against the stimulus list. Syllables 
which deviated from a reasonably
level inflection pattern or which were 
not good examples of the intended 
phoneme were repeated until they 
were unanimously judged to be satis- 
factory. Recordings were made in an 
anechoic room at a tape speed of 15 
inches per second. The recording 
microphone was an Altec 21 C con- 
denser microphone and the recorder 
a high fidelity Presto RC 10/24. 


Preparation of the Experimental 
Tape. The 12 speakers, recording 90 
syllables each, produced the 1080 syl- 
lables comprising the experimental 
tape for this study. The 1080 syllables 
were spliced into a tape in random 
order predetermined by a table of 
random numbers. The restriction that 
no two syllables from the same voice 
appear adjacent to each other was 
imposed on the randomization. 
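A restricted randomization of this kind can be illustrated in code. The study used a table of random numbers, so the sketch below is a modern stand-in rather than the procedure actually followed; the function name, the retry strategy, and the seed are all assumptions made for illustration.

```python
import random


def restricted_random_order(items, speaker_of, rng, max_attempts=1000):
    """Return a random ordering in which no two adjacent items share a speaker.

    Items are drawn from a shuffled pool; if only items from the speaker just
    used remain (a dead end), the whole attempt is discarded and restarted.
    """
    for _ in range(max_attempts):
        pool = list(items)
        rng.shuffle(pool)
        order = []
        while pool:
            last_speaker = speaker_of(order[-1]) if order else None
            eligible = next(
                (i for i, item in enumerate(pool) if speaker_of(item) != last_speaker),
                None,
            )
            if eligible is None:
                break  # dead end: restart with a fresh shuffle
            order.append(pool.pop(eligible))
        else:
            return order
    raise RuntimeError("no valid ordering found within max_attempts")


# 12 speakers x 90 syllables = 1080 items, labeled (speaker, syllable index).
items = [(speaker, syllable) for speaker in range(12) for syllable in range(90)]
order = restricted_random_order(items, speaker_of=lambda item: item[0],
                                rng=random.Random(1958))
print(len(order))
```

Taking the first eligible item from a shuffled pool does not sample all valid orderings with equal probability; it is meant only to show how the adjacency restriction can be enforced.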

In splicing, the syllables were sep- 
arated to allow space for dubbing in 
the sample number, repetition of the 
syllable, and time for rating. The 
tape transport mechanisms of the 
Presto RC 10/24 and the Presto PB 
10/24 were placed adjacent to each 
other so that the playback heads were 
separated by a distance of 12 inches. 
The master tape was run over both 
playback heads and the signal was 
passed through a mixing circuit to a 
Concertone Model 1401 recorder 
where it was recorded at a tape speed 
of 15 inches per second. This proce- 
dure resulted in a tape on which each 
of the original items could be heard 
twice. 


Scaling Procedure. Thirty-two 







graduate students who had had train- 
ing in the diagnosis of voice quality 
defects served as listeners. They first 
heard the verbal description of the 
harsh voice quality mentioned pre- 
viously. They were next acquainted 
with the range of harshness in the 
sample by listening to passages of con- 
nected speech recorded by the four 
speakers who had been previously 
judged by three experienced listeners 
to be the two most harsh and the two 
least harsh speakers of the experi- 
mental group. The third step in prep- 
aration for scaling was listening to 
three recorded series of seven sylla- 
bles each. Within each series there was 
a syllable representing each of the 
seven scale intervals in order of in- 
creasing harshness. This tape had been 
prepared by having two experienced 
listeners rate 150 of the syllables on 
a seven-point scale by the method of
equal-appearing intervals. Two trials
were given. Only those syllables on 
which there was agreement both be- 
tween listeners and between trials 
were used in the training tape. Finally, 
the last 30 of the original 1080 syllables 
which had been dubbed on a second 
tape were scaled for practice. 


The 1080 syllables were then rated 
on a seven-point scale by the method 
of equal-appearing intervals with one 
representing least severe harshness and 
seven representing most severe harsh- 
ness. The first 100 syllables were re- 
peated at the end of the experimental 
tape to provide for a reliability esti- 
mate. While the scale values from the 
first 100 syllables were used in com-
puting a Pearson r, only the scale
values from the second trial were
used in the analysis of the data, thus,
in effect, giving the listeners practice 
on 130 syllables. 


Instrumentation for the listening 
sessions consisted of a Concertone 


Model 1401 tape playback, a McIn-
tosh Model 50W2 amplifier, and a
Jensen BF 409 multiple speaker sys- 
tem. Listening was done in a sound- 
treated room. The scaling task was 
accomplished in two sessions. The 
first session was one hour and 45
minutes long interrupted mid-way by 
a short rest period. Instructions and 
training, including rating the 30 prac- 
tice syllables, were repeated at the 
beginning of the second session, one 
day later, which lasted one hour and
30 minutes. 


Results 


Scale Values. Scale values of sever- 
ity of harshness were derived from 
judgments of the 32 listeners who 
rated the 1080 syllables on a seven- 
point scale by the method of equal- 
appearing intervals with one repre- 
senting least severe harshness and 
seven representing most severe harsh- 
ness. Median scale values and Q-values 
were obtained from these judgments 
in the manner described by Thurstone 
and Chave (24). 
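A median scale value and a Q-value can be illustrated as follows. The sketch assumes that each scale point k covers the interval from k - 0.5 to k + 0.5, that percentiles are found by linear interpolation within an interval, and that Q is the distance between the 25th and 75th percentiles; the function names and the sample judgments are hypothetical, and the details may differ from the computation actually carried out.

```python
def interpolated_percentile(ratings, proportion, scale_points=7):
    """Percentile of integer ratings, interpolating within unit-wide intervals."""
    n = len(ratings)
    if n == 0:
        raise ValueError("ratings must not be empty")
    target = proportion * n
    cumulative = 0
    for point in range(1, scale_points + 1):
        count = ratings.count(point)
        if count and cumulative + count >= target:
            lower_limit = point - 0.5
            return lower_limit + (target - cumulative) / count
        cumulative += count
    return scale_points + 0.5


def scale_value_and_q(ratings):
    """Return (median scale value, Q-value) for one syllable's judgments."""
    median = interpolated_percentile(ratings, 0.50)
    q_value = (interpolated_percentile(ratings, 0.75)
               - interpolated_percentile(ratings, 0.25))
    return median, q_value


# Hypothetical judgments from 32 listeners for a single syllable.
judgments = [3] * 8 + [4] * 16 + [5] * 8
print(scale_value_and_q(judgments))
```

A small Q-value indicates that the listeners' judgments of a syllable cluster tightly around its median, which is why the mean Q-value is cited as evidence of reliability.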

A Pearson r of .90 was obtained 
between repeated scalings of the first 
100 syllables. The difference between 
means of the two trials, 3.78 and 3.53 
respectively (.25), is significant at the 
one per cent level (t=4.17). The 
trend was toward lower judgments 
on the final 100 syllables. Any trend 
which existed subsequent to the 
judgments on the first 100 syllables 
would contribute to the unreliability 
of the measures. The magnitude of 
the difference, however, is quite small. 
The mean Q-value for the 1080 scale 
values is .79. The obtained scale values 
were thus considered to be satisfac- 
torily reliable. 










Table 1. Summary of analysis of variance for evaluation of mean severity ratings of vowels and
environments.

Source               df        ss       ms      F*    F.05†
Vowels (V)            8     88.23    11.03   26.26    2.05
Environments (E)      9     50.07     5.56    8.69    1.99
Speakers (S)         11    689.27    62.66
VE                   72     54.17      .75    1.92    1.32
VS                   88     37.25      .42
ES                   99     63.45      .64
VES                 792    310.97      .39
Total              1079   1293.41

*F-ratios: $ms_V/ms_{VS}$; $ms_E/ms_{ES}$; $ms_{VE}/ms_{VES}$.
†F.05 is the tabled value for the nearest given df.
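The F-ratios in Table 1 follow directly from the mean squares and the error terms named in the table footnote. The short check below reproduces them to within rounding; the dictionary and function names are hypothetical.

```python
# Mean squares as printed in Table 1.
MEAN_SQUARES = {"V": 11.03, "E": 5.56, "VE": 0.75,
                "VS": 0.42, "ES": 0.64, "VES": 0.39}

# Error term for each tested effect, as given in the table footnote.
ERROR_TERM = {"V": "VS", "E": "ES", "VE": "VES"}


def f_ratios(mean_squares, error_term):
    """Divide each effect's mean square by the mean square of its error term."""
    return {effect: mean_squares[effect] / mean_squares[error]
            for effect, error in error_term.items()}


ratios = f_ratios(MEAN_SQUARES, ERROR_TERM)
for effect, value in ratios.items():
    print(f"{effect}: F = {value:.2f}")
```

Each main effect is tested against its interaction with speakers, so that differences among vowels or environments are judged against how consistently they appear from speaker to speaker.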


Rank Order of Vowels According
to Degree of Severity of Harshness.
Severity of harsh voice quality varies
as a function of vowels as indicated
by the significant¹ result of the F-test
of the effect of vowels (See Table 1).
The various individual environments,
however, exert a relatively different
influence from vowel to vowel as seen
from the significant vowels-by-envi-
ronments interaction.

¹The five per cent level of significance
was chosen for all statistical tests in this
study.

To determine whether the differ- 
ential effect of individual environ- 
ments from vowel to vowel is related 
to type of environment, three separate 
analyses were made, each of which 
included only those environments of a 
given type. No significant interaction 
was found between vowels and the va- 
rious voiced consonant environments, 
between vowels and the various voice- 
less consonant environments, or be- 
tween vowels and initiation environ- 


Table 2. Summary of analysis of variance for evaluation of mean severity ratings of vowels and
voiced consonant environments.

Source               df        ss       ms      F*    F.05†
Vowels (V)            8     23.41     2.93    7.71    2.05
Consonants (C)        3     11.73     3.91   11.85    2.90
Speakers (S)         11    363.89    33.08
VC                   24      8.97      .37    1.00    1.57
VS                   88     33.61      .38
CS                   33     10.92      .33
VCS                 264     98.01      .37
Total               431    550.54

*F-ratios: $ms_V/ms_{VS}$; $ms_C/ms_{CS}$; $ms_{VC}/ms_{VCS}$.
†F.05 is the tabled value for the nearest given df.










Table 3. Summary of analysis of variance for evaluation of mean severity ratings of vowels and
voiceless consonant environments.

Source               df        ss       ms      F*    F.05†
Vowels (V)            8     51.58     6.45   10.75    2.05
Consonants (C)        3      6.98     2.33    4.25    2.90
Speakers (S)         11    218.33    19.85
VC                   24      8.43      .35     .92    1.57
VS                   88     53.01      .60
CS                   33     18.07      .55
VCS                 264    101.81      .38
Total               431    457.58

*F-ratios: $ms_V/ms_{VS}$; $ms_C/ms_{CS}$; $ms_{VC}/ms_{VCS}$.
†F.05 is the tabled value for the nearest given df.


ments. Summaries of these analyses 
appear in Tables 2, 3, and 4. From 
these results it may be assumed that 
(1) voiced consonant environments 
[d], [g], [v] and [z] exert the same 
relative influence on perceived harsh- 
ness from vowel to vowel, (2) voice- 
less consonant environments [t], [k], 
[f] and [s] exert the same relative 
influence from vowel to vowel and 
(3) initiation environments consisting 
of the releasing consonant [h] and of 
no releasing consonant exert the same 
relative influence from vowel to 


vowel. 

A significant voicing-by-vowels in- 
teraction (See Table 5) indicates that 
voiced consonant environments and 
voiceless consonant environments have 
relatively different effects upon harsh- 
ness from vowel to vowel. Initiation 
environments could not be included in 
this analysis because there was no 
way of combining the individual initi- 
ation environments with individual 
voiced and voiceless consonant envi- 
ronments for statistical analysis. By 
inspection of the data in Table 6, 


Table 4. Summary of analysis of variance for evaluation of mean severity ratings for vowels and
initiation environments.

Source               df        ss       ms      F*    F.05†
Vowels (V)            8     26.90     3.36    7.15    2.05
Initiation (I)        1      5.22     5.22    6.14    4.84
Speakers (S)         11    132.15    12.01
VI                    8      5.11      .64    1.45    2.05
VS                   88     41.69      .47
IS                   11      9.35      .85
VIS                  88     38.73      .44
Total               215    259.19

*F-ratios: $ms_V/ms_{VS}$; $ms_I/ms_{IS}$; $ms_{VI}/ms_{VIS}$.
†F.05 is the tabled value for the nearest given df.






















Table 5. Summary of analysis of variance for evaluation of mean severity ratings of vowels,
voicing and cognates.

Source               df        ss       ms      F*    F.05†
Voicing (Vg)          1     17.63    17.63   11.23    4.84
Cognates (C)          3     16.80     5.60   10.37    2.90
Vowels (Vo)           8     64.83     8.10   15.58    2.05
Speakers (S)         11    564.98    51.36
VgC                   3      1.91      .64    1.88    2.90
VgVo                  8     10.17     1.27    2.76    2.05
CVo                  24      9.53      .40    1.05    1.57
VgS                  11     17.24     1.57
CS                   33     17.84      .54
VoS                  88     46.19      .52
VgCVo                24      7.86      .33     .87    1.57
VgCS                 33     11.15      .34
VgVoS                88     40.41      .46
CVoS                264     99.36      .38
VgCVoS              264     99.85      .38
Total               863   1025.75

*F-ratios: $ms_{Vg}/ms_{VgS}$; $ms_C/ms_{CS}$; $ms_{Vo}/ms_{VoS}$; $ms_{VgC}/ms_{VgCS}$;
$ms_{VgVo}/ms_{VgVoS}$; $ms_{CVo}/ms_{CVoS}$; $ms_{VgCVo}/ms_{VgCVoS}$.
†F.05 is the tabled value for the nearest given df.


however, the assumption that initia-
tion environments would also con-
tribute to the interaction of types of
environments with vowels seems rea-
sonable. Granting this assumption,
these results indicate that types of
environment (voiced, voiceless, and
initiation) exert differential effects on
perceived harshness from vowel to
vowel but that similar environments 
exert the same relative influence from 
vowel to vowel. The mean scale values 
for the vowels in the various environ- 
ments along with the critical differ- 


Table 6. Mean severity ratings for vowels in voiced consonant environments, in voiceless consonant
environments, in initiation environments and in all environments combined. The critical difference
(c.d.*) is the difference necessary for significance at the five per cent level.

Voiced Consonant    Voiceless Consonant    Initiation        Environments
Environments        Environments           Environments      Combined
[i]    3.44         [i]    3.02            [u]    3.35       [i]    3.26
[u]    3.46         [u]    3.05            [i]    3.37       [u]    3.28
[ɪ]    3.68         [ʊ]    3.21            [ɪ]    3.63       [ɪ]    3.58
[ʌ]    3.76         [ɪ]    3.45            [ʌ]    3.84       [ʊ]    3.67
[ɛ]    3.92         [ɛ]    3.67            [ʊ]    4.00       [ʌ]    3.79
[ʊ]    3.97         [æ]    3.74            [ɑ]    4.14       [ɛ]    3.89
[æ]    4.02         [ʌ]    3.79            [æ]    4.19       [æ]    3.94
[ɑ]    4.04         [ɑ]    3.91            [ɛ]    4.26       [ɑ]    4.01
[ɔ]    4.09         [ɔ]    3.97            [ɔ]    4.33       [ɔ]    4.09
c.d.*   .24         c.d.    .31            c.d.    .39       c.d.    .16

*$\text{c.d.} = t_{.05}\,(2\,ms_{error}/n)^{1/2}$. See Tables 1 through 4 for error terms employed.










Table 7. Shifts in relative severity of vowels as a result of environment influence. The vowels
are listed in order from least to most severely harsh. For each type of environment the vowels in
Groups 1 and 3 do not differ significantly within the group. The vowels in Group 2 differ sig-
nificantly from the extremes of least harsh vowels in Group 1 and most harsh vowels in Group 3.

Environments                     Vowels
                Group 1          Group 2      Group 3
Voiced          [i] [u]          [ɪ] [ʌ]      [ɛ] [ʊ] [æ] [ɑ] [ɔ]
Voiceless       [i] [u] [ʊ]      [ɪ]          [ɛ] [æ] [ʌ] [ɑ] [ɔ]
Initiation      [u] [i] [ɪ]      [ʌ]          [ʊ] [ɑ] [æ] [ɛ] [ɔ]








ences necessary for significance ap- 
pear in Table 6. 
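The critical difference formula given in the footnote to Table 6 can be evaluated numerically. The sketch below is an illustration under assumptions that the article does not state explicitly: the error term is taken to be the vowels-by-speakers mean square of Table 4 (.47), n is taken as 24 ratings per vowel mean (12 speakers in 2 initiation environments), and 1.99 is used as an approximate two-tailed five per cent t value for about 88 degrees of freedom. Under those assumptions the result agrees with the tabled value of .39 for initiation environments.

```python
import math


def critical_difference(t_value, ms_error, n):
    """c.d. = t * sqrt(2 * ms_error / n), the smallest significant mean difference."""
    if n <= 0 or ms_error < 0:
        raise ValueError("n must be positive and ms_error non-negative")
    return t_value * math.sqrt(2.0 * ms_error / n)


# Assumed inputs (see the paragraph above); not confirmed by the source.
cd_initiation = critical_difference(t_value=1.99, ms_error=0.47, n=24)
print(f"{cd_initiation:.2f}")
```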

Further examination of Table 6
shows that for all environments com-
bined, the obtained order of vowels
with respect to increasing severity of
harshness is [i], [u], [ɪ], [ʊ], [ʌ], [ɛ],
[æ], [ɑ] and [ɔ]. This order shows
few significant changes with separate
analyses for each of the three types of
environments. The vowels [i] and
[u] are always rated least harsh re-
gardless of type of environment. In
general, [ɪ], [ʊ], [ʌ] and [ɛ] are
rated as moderately harsh in varying
order depending upon environment.
The vowels [æ] and [ɑ] are rated
relatively severely harsh with [ɔ]
rated most harsh regardless of type
of environment.

The vowels [ʊ], [ɪ] and [ʌ] are
most influenced by the differential
effects of type of environment. The
vowel [ʊ] is relatively mild in harsh-
ness in voiceless consonant environ-
ments and relatively severe in harsh-
ness in voiced consonant and initiation
environments. The vowel [ɪ] is
relatively mild in harshness in initiation
environments and becomes more harsh
in consonant environments of either
type. The vowel [ʌ] is relatively
severe in harshness in voiceless con-
sonant environments and becomes less
severe in voiced consonant and initia-
tion environments.

The differing relationships among 
vowels in the three types of environ- 
ments can be demonstrated by group- 
ing the vowels as follows: (1) least 


Table 8. Differences in severity ratings for vowels averaged for cognate pair environments. The
critical difference (c.d.*) necessary for significance at the five per cent level is .14. Significant dif-
ferences are underlined.

Mean Severity                               Cognate Pairs
Rating          Cognate Pairs    [g], [k]    [v], [f]    [z], [s]
3.55            [d], [t]           .02        _.29_       _.25_
3.53            [g], [k]                      _.31_       _.27_
3.84            [v], [f]                                   .04
3.80            [z], [s]

*$\text{c.d.} = t_{.05}\,(2\,ms_{error}/n)^{1/2}$. See Table 5 for error term employed.




























Table 9. Mean severity ratings for vowels in consonant environments and initiation environments.
The critical differences (c.d.*) necessary for significance at the five per cent level are .16 for con-
sonant environments and .22 for any comparison including an initiation environment.

Environment        Mean Rating for Vowels
[k]                3.36
[t]                3.47
[f]                3.63
[d]                3.64
[s]                3.68
[g]                3.69
[h]                3.75
[z]                3.91
[v]                4.05
No Consonant       4.06

*$\text{c.d.} = t_{.05}\,(2\,ms_{error}/n)^{1/2}$. See Tables 5 and 1 for the error terms employed in comput-
ing the critical differences for consonant environments and for comparisons including an initia-
tion environment, respectively.


harsh vowels which do not differ 
significantly from each other, (2) 
vowels which differ from both the 
least harsh vowel and the most harsh 
vowel and (3) most harsh vowels 
which do not differ significantly from 


each other. This grouping is shown in 
Table 7. 
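The three-way grouping just described can be expressed as a simple rule applied to the means and critical difference of Table 6. The sketch below uses the voiceless consonant column; it simplifies the grouping by comparing each vowel only with the least harsh and the most harsh vowel, and the function name `group_vowels` is hypothetical.

```python
# Voiceless consonant environment means and critical difference, as read from Table 6.
VOICELESS_MEANS = {"i": 3.02, "u": 3.05, "ʊ": 3.21, "ɪ": 3.45, "ɛ": 3.67,
                   "æ": 3.74, "ʌ": 3.79, "ɑ": 3.91, "ɔ": 3.97}
VOICELESS_CD = 0.31


def group_vowels(means, cd):
    """Assign vowels to Group 1 (near the least harsh), Group 3 (near the
    most harsh), or Group 2 (all other vowels)."""
    lowest, highest = min(means.values()), max(means.values())
    groups = {1: [], 2: [], 3: []}
    for vowel, mean in sorted(means.items(), key=lambda pair: pair[1]):
        near_low = (mean - lowest) < cd     # not significantly above the minimum
        near_high = (highest - mean) < cd   # not significantly below the maximum
        if near_low and not near_high:
            groups[1].append(vowel)
        elif near_high and not near_low:
            groups[3].append(vowel)
        else:
            groups[2].append(vowel)
    return groups


groups = group_vowels(VOICELESS_MEANS, VOICELESS_CD)
print(groups)
```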


The Influence of Consonant Envi- 
ronments on Harshness of Vowels. 
Since there are so few significant 
changes in the rank order of vowels, 
an evaluation of the differences be- 
tween voiced and voiceless environ- 
ments for all vowels combined seems 
justified even though there is a sig- 
nificant voicing-by-vowels interaction 
as reported in Table 5. Vowels in 
voiced consonant environments are 
significantly more harsh than vowels 
in voiceless consonant environments. 
The mean severity ratings for vowels 
in these environments are 3.82 and 
3.54, respectively. 

Vowels in fricative environments 
are significantly more harsh than 
vowels in stop-plosive environments 
when severity ratings of vowels in 
each voiced and voiceless cognate 


pair of consonants are averaged. Inter- 
comparisons as shown in Table 8 indi- 
cate that the ratings for vowels in 
the fricative environments are remark- 
ably close together as are the ratings 
for vowels in stop-plosive environ-
ments. Comparisons among cognate
pairs show that vowels in either of 
the fricative environments are signifi- 
cantly more harsh than vowels in 
either of the stop-plosive environ- 
ments. 
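The cognate-pair comparison can be retraced from the published means. The following is a minimal sketch, assuming only the environment means of Table 9 and the critical difference of .14 stated for Table 8; the function names are illustrative and are not part of the original analysis. Because it averages unrounded means, the differences it yields may depart from the tabled values by about .01.

```python
from itertools import combinations

# Mean severity ratings for vowels in each consonant environment (Table 9).
MEAN_RATING = {"k": 3.36, "t": 3.47, "f": 3.63, "d": 3.64,
               "s": 3.68, "g": 3.69, "z": 3.91, "v": 4.05}

# Cognate pairs as (voiced, voiceless) consonants.
COGNATE_PAIRS = [("d", "t"), ("g", "k"), ("v", "f"), ("z", "s")]

# Critical difference at the five per cent level, as stated for Table 8.
CRITICAL_DIFFERENCE = 0.14


def pair_means(ratings, pairs):
    """Average the voiced and voiceless member of each cognate pair."""
    return {pair: (ratings[pair[0]] + ratings[pair[1]]) / 2 for pair in pairs}


def significant_differences(means, cd):
    """Return {(pair_a, pair_b): (difference, is_significant)} for all pairs."""
    result = {}
    for a, b in combinations(means, 2):
        diff = round(abs(means[a] - means[b]), 3)
        result[(a, b)] = (diff, diff >= cd)
    return result


def class_mean(ratings, consonants):
    """Mean rating over a class of consonant environments."""
    return sum(ratings[c] for c in consonants) / len(consonants)
```

Applied to these values, the only differences reaching the critical difference are the four between a stop-plosive pair and a fricative pair, and `class_mean` gives approximately 3.82 for the voiced and 3.54 for the voiceless environments.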


The mean severity ratings for the 
combined vowels in the eight indi- 
vidual consonant environments and 
the two initiation environments along 
with the critical differences necessary 
for significance at the five per cent 
level are shown in Table 9. Compari- 
sons among individual consonant 
environments (excluding initiation 
environments) show that vowels in 
the voiceless stop-plosive environ- 
ments [k] and [t] are significantly 
less harsh than vowels in any other en- 
vironment. Vowels in the voiceless 
fricative environments [f] and [s] and 
in the voiced stop-plosive environ- 
ments [d] and [g] are significantly 
more harsh than vowels in the voice- 










less stop-plosive environments. Vowels 
in the voiced fricative environments 
[z] and [v] are significantly more 
harsh than vowels in the other envi- 
ronments. 


The Influence of Initiation Envi- 
ronments on Harshness of Vowels. 
Vowels initiated without a releasing 
consonant are significantly more harsh 
than vowels released by the consonant 
[h] (See Tables 4 and 9). Mean sever- 
ity ratings of vowels in these initia- 
tion environments are 4.06 and 3.75 
respectively. 

Relative to vowels in other environ- 
ments, the isolated vowels are ranked 
most severely harsh although they do 
not differ significantly in severity 
from vowels in the voiced fricative 
environments [z] and [v]. Vowels 
initiated with the consonant [h], 
while less harsh than isolated vowels, 
are, however, ranked at the more 
severe end of the continuum of harsh- 
ness. Gradual initiation with the use 
of [h] does not reduce the harshness 
of vowels relative to most of the other 
consonant environments. 
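The comparisons involving initiation environments can be checked in the same way. This sketch assumes the means of Table 9 and the critical difference of .22 given there for any comparison including an initiation environment; the helper names are hypothetical.

```python
# Mean severity ratings from Table 9.
CONSONANT_MEANS = {"k": 3.36, "t": 3.47, "f": 3.63, "d": 3.64,
                   "s": 3.68, "g": 3.69, "z": 3.91, "v": 4.05}
NO_CONSONANT_MEAN = 4.06   # isolated vowels
H_RELEASED_MEAN = 3.75     # vowels released by [h]

# Critical difference for comparisons including an initiation environment.
CD_INITIATION = 0.22


def not_significantly_different(target_mean, other_means, cd):
    """List environments whose mean lies within the critical difference."""
    return sorted(env for env, mean in other_means.items()
                  if round(abs(target_mean - mean), 2) < cd)


def environments_rated_lower(target_mean, other_means):
    """Count environments with a lower (less harsh) mean than the target."""
    return sum(1 for mean in other_means.values() if mean < target_mean)
```

With these values the isolated vowels fail to differ significantly only from the [v] and [z] environments, and the [h] mean exceeds the means of six of the eight consonant environments.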


Discussion 


The order in which vowels increase 
in perceived harshness is closely re- 
lated to height of tongue position. 
Kenyon’s (12) approximation of the 
relative height of tongue position pro- 
gressing from high to mid to low is 
as follows: [i], [ɪ], [u], [ʊ], [e], [ʌ],
[æ], [ɔ], [ɑ]. The obtained order of
vowels ranked according to degree 
of harshness (See Table 7) corre- 
sponds closely to this progression. 

Vowels in combined environments 
show a definite progression of in- 
creasing harshness from high to mid 
to low tongue position. The high 


vowels [i], [u] and [ɪ] are always
significantly less harsh than the low
vowels [ɔ] and [ɑ] regardless of en-
vironment. The high vowel [ʊ],
which has the lowest tongue position
of the high vowels, is rated less harsh
than the low vowels [ɔ] and [ɑ] in all
three types of environments. The dif-
ference, however, is significant only
for voiceless environments. The mid
vowels [ʌ] and [e] rank in the center
of the continuum of harshness with
two exceptions: [e] in initiation envi-
ronments and [ʌ] in voiceless environ-
ments in which they are relatively
severely harsh. The low vowel [ɔ] has
the highest harshness rating regardless
of type of environment, with [ɑ]
ranking next except in initiation envi-
ronments. The vowel [æ], with the
highest tongue position of the low 
vowels, is rated less harsh than the 
other low vowels though not signifi- 
cantly so. 

Harshness of vowels does not seem 
to be related to either the front-back 
or to the tense-lax classification. The 
front vowels [i] and [ɪ] are among
the least harsh vowels but so are the
back vowels [u] and [ʊ]. The two
least harsh vowels [i] and [u] are
tense vowels but so is the most harsh
vowel [ɔ].

With respect to high and low 
vowels, the obtained data agree with 
the Sherman and Linke (22) results 
on connected speech. Their passage 
containing low vowels was rated 
more harsh than their passage con- 
taining high vowels. However, on the 
basis of their results it was expected 
that a tense-lax relationship with 
harshness would also obtain. Further 
experimentation (21) did show that 
when these same passages were played 
backward and scaled, the passage 


containing high vowels was signifi- 
cantly less harsh than any of the 
















other passages and the passage con- 
taining low vowels was significantly 
more harsh than any of the other 
passages. Any comparisons between 
passages containing tense, lax, front 
and back vowels were non-significant. 
The results from scaling these passages 
played backward are in complete 
agreement with the present results. 
The change in relationships among 
the passages due to backward playing 
was hypothesized to be the result of 
obviating the influence of extraneous 
factors which might affect judgments 
when passages of connected speech 
are played in the usual manner. 

Since both Joos (11) and Peterson
(17) have pointed out that the con- 
ventional physiological classification 
of vowels is essentially acoustic, and 
since changes in perceived harshness 
were observed to accompany changes 
in height of tongue position of the 
vowels, some of the acoustical char- 
acteristics of vowels as they vary 
with height of tongue position were 
examined. These data, reported in the 
literature, were obtained on normal 
speakers. 


Five independent investigators (1,
4, 10, 18, 23) agree that high vowels 
have higher fundamental frequencies 
than low vowels with the mid vowels 
falling between the extremes. Rising 
fundamental frequency associated with 
increased tongue height does not fol- 
low the physiological ordering per- 
fectly but the relationship is obvious. 

Generally, high and mid vowels 
have shorter durations than low 
vowels. Investigators (1, 4, 9, 10, 13, 
14, 15, 16, 19) of vowel duration agree 
that the high vowels [ʊ] and [ɪ] and
the mid vowels [ʌ] and [e] have
short durations compared with low 
vowels. No consensus has been reached 
concerning [i] and [u] since some 
investigators found their durations to 


be relatively short while others found 
them to be relatively long. 


As far as intensity is concerned, 
there is agreement among several in- 
vestigators (1, 8, 20) that high and 
mid vowels have less average intensity 
and less average peak power than do 
low vowels. 

As previously pointed out, perceived 
harshness of vowels decreases with 
increasing height of tongue position. 
The acoustical characteristics associ- 
ated with increasing height of tongue 
position of vowels and thus inferred 
to be associated with decreasing harsh- 
ness are as follows: (1) increasing 
fundamental frequency, (2) generally 
decreasing duration, (3) generally de- 
creasing average intensity and (4) 
decreasing average peak power. 


House and Fairbanks (10) studied
the influence of 12 consonant environ- 
ments on the secondary acoustical 
characteristics of six vowels. On the 
assumption that the differences in ma- 
terial used and in control of certain 
variables would not interact with the 
variables employed in this study, their 
results contribute to the interpretation 
of the present data. An examination 
of these results for the effect of only 
those consonant environments ger- 
mane to the present study revealed 
a general patterning such that higher 
fundamental frequency, shorter dura- 
tion and less relative power were 
associated with vowels in voiceless and 
in stop-plosive consonant environ- 
ments when comparisons were made 
between vowels in voiceless and 
voiced consonant environments and 
between vowels in stop-plosive and 
fricative environments. Differences 
due to voicing were greater than dif- 
ferences resulting from change in 
manner of production. 


In the present study, harshness is 
perceptually diminished for vowels in 










voiceless and stop-plosive consonant 
environments when comparisons are 
made between vowels in voiceless and 
voiced consonant environments and 
between vowels in stop-plosive and 
fricative environments. Assuming that 
the influence of consonant environ- 
ments on vowels is the same for harsh 
voices as it is for normal voices, then 
it may be said that higher fundamental 
frequency, shorter duration and lower 
relative power, are, in a general way, 
associated with decreased harshness. 


The same changes in both (1) 
height of tongue position of vowels 
and (2) consonant environments of 
vowels which are related to increased 
fundamental frequency, shorter dura- 
tion and less relative power of vowels 
apparently are also associated with 
diminished harshness of vowels. The 
relationship appears close enough to 
warrant direct examination of funda- 
mental frequency, duration and rel- 
ative power of vowels in speech 
samples from persons with harsh 
voices. 


Fairbanks (7) recommends the use 
of the consonant [h] before vowels to 
effect gradual initiation of the vowel 
as a corrective drill for harshness. 
This recommendation implies that 
gradual vowel initiation reduces harsh- 
ness. The results of the present study 
substantiate this opinion insofar as 
vowels released by the consonant [h] 
were perceived as less harsh than 
vowels without a releasing consonant. 
Vowels released by the consonant 
[h], however, were relatively more 
harsh than vowels in most of the other 
consonant environments employed. 


Abruptness of initiation, in the 
sense that isolated or initial vowels 
are more abruptly initiated than the 
same vowels released by a consonant, 
could be related to harshness since 
isolated vowels are perceived as sig- 


nificantly more harsh than vowels in 
consonant environments with only 
two exceptions. The consonant [h], 
however, is not unique in effecting 
gradual initiation of vowels. 


Preceding the vowel by almost any 
consonant has the effect of promoting a 
gradual build-up. The consonant not
only provides an initial or warning low 
amplitude modulation, but is conducive 
to a more gradual build-up in the vowel 
sound itself (6). 


If the hypothesis that differences in 
harshness are due to the influence of 
the consonant on the build-up of the 
vowel is to be accepted, then, on the 
basis of results obtained in this study, 
the starting characteristics of vowels 
would be expected to be most gradual 
in voiceless stop-plosive environments, 
somewhat less gradual in voiced stop- 
plosive and voiceless fricative environ- 
ments, and relatively abrupt in voiced 
fricative environments. Starting char- 
acteristics of vowels in voiced frica- 
tive environments would be expected 


to be similar to those of isolated 
vowels. 
Perceived harshness apparently


varies with a number of parameters. 
The present study has indicated varia- 
tion with height of tongue position of 
vowels, consonant environments of 
vowels and type of vowel initiation. 
Examination of acoustical data related 
to these parameters but obtained on 
normal voices suggests that funda- 
mental frequency, duration, intensity 
and starting characteristics of vowels 
might be important variables for 
further investigations of harsh voice 


quality. 


Summary 


The influence of (1) vowels, (2) 
some selected consonant environments 
and (3) vowel initiation on perceived 



















harsh voice quality was studied. The 
vowels were [i], [ɪ], [e], [æ], [ʌ],
[ɑ], [ɔ], [ʊ] and [u]. Consonants
used in CVC syllables, with the same 
consonant preceding and succeeding 
each vowel, were (1) voiced stop- 
plosives [d] and [g], (2) voiced frica- 
tives [v] and [z], (3) voiceless stop- 
plosives [t] and [k] and (4) voiceless 
fricatives [f] and [s]. Each vowel 
was combined with [h] in a CV sylla- 
ble to provide for comparing gradual 
initiation of the vowel with the more 
abrupt initiation of the isolated vowel. 
The total number of syllables was thus 
90. 


Twelve speakers with clinically 
diagnosed harsh voices recorded the 
syllables in an anechoic room using 
high fidelity recording equipment. 
Thirty-two listeners rated the sylla- 
bles for severity of harshness on a 
seven-point equal-appearing intervals 
scale. Median scale values were de- 
rived from their responses. 

On the basis of the obtained re- 
sults the following conclusions appear 
warranted: (1) The order of vowels 
with respect to increasing severity of 
harshness for all environments com-
bined is [i], [u], [ɪ], [ʊ], [ʌ], [e],
[æ], [ɑ], [ɔ] (Not all obtained differ-
ences between adjacent vowels were
statistically significant). (2) Severity 
of harshness of vowels varies with 
height of tongue position with higher 
vowels less harsh than lower vowels. 
(3) Differences among vowels with 
respect to severity of harshness vary 
with type of environment: voiced, 
voiceless and initiation. Within a given 
type of environment, however, indi- 
vidual environments have the same 
relative effect. (4) Vowels in voiceless 
consonant and stop-plosive environ- 
ments are less harsh than vowels in 
voiced and in fricative environments. 
(5) Harshness is least severe for voice- 


less stop-plosive environments, more 
severe for voiced stop-plosive and 
voiceless fricative and most severe for 
voiced fricative. (6) Isolated vowels 
are more harsh than corresponding 
CV syllables initiated with [h]. Both 
are more harsh than most other cor- 
responding CVC syllables. 


References 


1. Black, J. W., Natural frequency, duration, and intensity of vowels in
reading. JSHD, 14, 1949, 216-221.

2. Brackett, I. P., A study of the growth of inflammation on the vocal folds
accompanying easy and harsh production of the voice. Unpublished Master’s
thesis, Northwestern University, 1937.

3. Craig, W. C., and Sokolowsky, R. R., The Preacher’s Voice. Columbus, Ohio:
The Wartburg Press, 1945.

4. Crandall, I. B., The sounds of speech. Bell Syst. tech. J., 4, 1925, 586-626.

5. Curtis, J. F., Disorders of voice. In Speech Handicapped School Children,
Johnson, W. et al., Rev. ed., New York: Harper, 1956.

6. Drew, R. O., and Kellogg, E. W., Starting characteristics of speech sounds.
J. acoust. Soc. Amer., 12, 1940, 95-103.

7. Fairbanks, G., Voice and Articulation Drill Book. New York: Harper, 1940.

8. Fairbanks, G., House, A. S., and Stevens, E. L., An experimental study of
vowel intensities. J. acoust. Soc. Amer., 22, 1950, 457-459.

9. Heffner, R.-M. S., Notes on the length of vowels. Amer. Speech, 12, 1937,
128-134.

10. House, A. S., and Fairbanks, G., The influence of consonant environment
upon the secondary acoustical characteristics of vowels. J. acoust. Soc.
Amer., 25, 1953, 105-113.

11. Joos, M., Acoustic phonetics. Language Monograph No. 23, Language,
Supplement, 24, 1948, 1-136.

12. Kenyon, J. S., American Pronunciation. Ann Arbor, Michigan: George Wahr,
1940.

13. Lehmann, W. P., and Heffner, R.-M. S., Notes on the length of vowels (III).
Amer. Speech, 15, 1940, 377-380.

14. Lehmann, W. P., and Heffner, R.-M. S., Notes on the length of vowels (VI).
Amer. Speech, 18, 1943, 208-215.

15. Locke, W. N., and Heffner, R.-M. S., Notes on the length of vowels (II).
Amer. Speech, 15, 1940, 74-79.

16. Parmenter, C. E., and Trevino, S. N., The length of the sounds of a middle
westerner. Amer. Speech, 10, 1935, 129-133.

17. Peterson, G. E., The phonetic value of vowels. Language, 27, 1951, 541-553.

18. Peterson, G. E., and Barney, H. L., Control methods used in a study of the
vowels. J. acoust. Soc. Amer., 24, 1952, 175-184.

19. Rositzke, H. A., Vowel-length in general American speech. Language, 15,
1939, 99-109.

20. Sacia, C. F., and Beck, C. J., The power of fundamental speech sounds. Bell
Syst. tech. J., 5, 1926, 393-403.

21. Sherman, D., The merits of backward playing of connected speech in the
scaling of voice quality disorders. JSHD, 19, 1954, 312-321.

22. Sherman, D., and Linke, E., The influence of certain vowel types on degree
of harsh voice quality. JSHD, 17, 1952, 401-408.

23. Taylor, H. C., The fundamental pitch of English vowels. J. exp. Psychol.,
16, 1933, 565-582.

24. Thurstone, L. L., and Chave, E. J., The Measurement of Attitude. Chicago:
University of Chicago Press, 1929.














Objective Speech Audiometry: 
A New Method Based On 


Electrodermal Response 


Howard B. Ruhm 


Raymond Carhart 


The technique of electrodermal au- 
diometry is valuable as an objective 
means of determining auditory thresh- 
olds. Bordley, Hardy and Richter (2) 
and Bordley, and Hardy (1) demon- 
strated the application of the method. 
They were able to assess the acuity 
for pure tones possessed by many sub- 
jects who were either unable or un- 
willing to yield reliable voluntary 
thresholds. Particularly notable is the 
manner in which the Johns Hopkins 
group pioneered the use of electro- 
dermal responses in the evaluation of 
hearing in young children. Bordley 
and Haskins (3) later found that the 
technique helps differentiate periph- 
eral presbycusis from central auditory 
disorders of the aged. Several other 
workers, notably Doerfler, Stewart 
and their associates (5, 6, 11, 13, 14), 
have contributed refinements in pro- 
cedure and have enhanced our under- 
standing of the precautions necessary 





Howard B. Ruhm (M.A., University of
Maryland, 1956) is Research and Clinical
Assistant, Northwestern University and
Clinical Audiologist, Veterans Administration.
Raymond Carhart (Ph.D., Northwestern
University, 1936) is Professor of Audiology,
Northwestern University. This article is
adapted from a paper presented at
the 1957 Convention of the American
Speech and Hearing Association in
Cincinnati.


Volume 1, No. 2 




to employ electrodermal audiometry 
effectively. Concurrently, the method- 
ology has been proving most valuable 
in differentiating ‘organic’ involve- 
ments in adults from ‘functional’ 
problems, whether these latter be psy- 
chogenic hearing losses or instances of 
malingering (7, 9).

Early efforts to use the electroder- 
mal technique to determine the 
threshold for speech failed to yield 
procedures which were dependable. 
The difficulty is exemplified by the 
experience which Knapp and Gold 
(10) recount. They did not attempt 
systematically to establish speech re- 
ception threshold by EDR. However, 
they presented speech stimuli to a 
large group of subjects. Knapp and 
Gold recorded the changes in skin 
resistance and required their subjects 
to report not only when the presence 
of speech was first noted, but also 
when it was first understood. Their 
findings by electrodermal audiometry 
substantiated the voluntary responses 
in the majority of those cases for 
whom a previous diagnosis of organic 
hearing loss had been made. The 
minimal intensity level evoking elec- 
trodermal response deviated markedly 
from the admitted speech threshold 
where a functional involvement was 
present. 


June 1958 










The difficulty with an approach 
such as the one taken by Knapp and 
Gold is that the electrodermal re- 
sponse to a particular speech stimu- 
lus is not stable from one subject 
to another nor for the same subject 
at different times. True, a drop in 
skin resistance is often elicited when 
speech materials are heard. However, 
the fact that each test item conveys 
meaning results in the paradoxical 
situation that a given sample of speech 
carries distinctive emotional signifi- 
cance for each listener. A marked 
electrodermal response is to be ex- 
pected if the word or sentence heard 
is affectively important to the person 
under test. Unfortunately, when a re- 
sponse does not appear, the clinician 
can not be certain whether the test 
item was inaudible or was merely un- 
exciting to the subject. 

The commonly chosen solution is 
to select words or sentences calculated 
to evoke strong emotional reactions 
and present these in a sequence which 
is reminiscent of administering a lie- 
detection test. Such a solution fre- 
quently gives a qualitative estimate of 
the threshold for speech, but it does 
not lend itself to the kind of stand- 
ardization which allows quantification 
of results. Electrodermal speech au- 
diometry can not be expected to yield 
precise results unless a method is 
used whereby the linkage of electro- 
dermal response to speech stimuli can 
be controlled by the tester. 

The problem is complicated by the 
fact that the electrodermal reaction
to a particular speech item, although 
initially strong, diminishes with suc- 
cessive presentations of the item. In 
other words, a speech stimulus loses 
its power to evoke an EDR if ad- 
ministered too often. 


A similar difficulty had to be 
solved before pure tones could be 




used effectively as the stimulus in 
electrodermal audiometry. As Doer- 
fler (5) has pointed out, successive 
presentations of an audible pure tone 
yield progressively less definitive elec- 
trodermal responses as the sequence 
proceeds. In fact, the response usually
disappears after a few repetitions of 
the pure tone. It is in order to elimi- 
nate such an extinction of reaction 
that a period of ‘conditioning’ with 
electric shock is included as the first 
stage in the conventional procedure 
for determining thresholds for pure 
tones by electrodermal audiometry. 
During this stage the technique is to 
present an audible tone which is fol- 
lowed closely by an electric shock. 
The sequence is repeated sufficiently 
often so that the electrodermal re- 
sponse to the electric shock is trans- 
ferred to the tonal stimulus. A sys- 
tematic exploration for auditory 
threshold is then possible without the 
use of shock, which is now employed 
only occasionally as a means of re- 
enforcing the linkage between au- 
ditory excitation and electrodermal 
reaction. 

The mechanisms which make such 
a ‘conditioning’ procedure effective 
are poorly understood. It still remains 
to be demonstrated whether it is pos- 
sible to achieve classical conditioning 
of a response involving the auto- 
nomic nervous system when the 
sequence of stimulation is as brief as 
the maximum which is practical in 
the clinical application of electroder-
mal audiometry. Be that as it may,
effective transfer of the electrodermal 
response from the shock to the tonal
stimulus can be achieved in many 
young children under circumstances 
reminiscent of traditional condition- 
ing. On the other hand, transfer is 
achieved with many older children 
and adults very quickly. A conscious 

























apprehension that electric shock may 
follow the tone seems in these in- 
stances to be an important factor. It 
remains for the psychologist and the 
neurophysiologist to answer the ques- 
tion as to whether or not such results 
can properly be classed as phenomena 
of conditioning. 

Meanwhile, the audiologist knows 
he may continue with confidence to 
use the ‘conditioning’ technique to 
re-enforce the electrodermal response 
to an audible pure tone stimulus. True, 
there are patients who do not respond 
to ‘conditioning,’ but this fact does 
not make the method less useful in 
those instances where it is effective. 
The method has proved its clinical 
value through making it feasible to 
determine pure tone thresholds by 
electrodermal audiometry in numer-
ous cases where conventional audiom-
etry would fail.

An analogous procedure of ‘con- 
ditioning’ through associated electric 
shock must be applied to speech au- 
diometry if the clinician is to counter- 
act the tendency of the electrodermal 
response to disappear as the test se- 
quence progresses. However, a major 
limitation appears if one attempts to 
‘condition’ the electrodermal response 
to successively changing speech 
samples, such as a whole series of 
spondees. It is true that a response 
will in consequence occur each time 
a speech stimulus is audible. It is not 
certain, however, that the response is 
anything more than a reaction to 
awareness of the presence of speech. 
Thus, insofar as can be judged in a 
given instance, only the threshold of 
detectability for speech will be re- 
vealed by this method. The current 
need is for a clinically feasible tech- 
nique for obtaining definitive electro- 
dermal responses which will reveal 


the threshold of intelligibility for 
speech. 

Such a method is at hand. The pur- 
pose of this paper is, first, to describe 
the procedure involved and, second, 
to report experimental findings which 
indicate its validity. 


Procedure 


The basic feature of the method is 
that the electrodermal response is 
‘conditioned’ to occur following a
single speech item. Concomitantly, 
because of the manner in which the 
conditioning process is carried out, the 
electrodermal response becomes re- 
duced whenever any other speech 
items are perceived. That is, the in- 
itial phase of the technique consists 
of presenting a series of speech stim- 
uli which are audible to the subject. 
Spondaic words are excellent for this 
purpose. One word is selected as the 
key item. This word is always fol- 
lowed by electric shock, which is 
omitted whenever any other speech 
item is administered. If this method 
is properly carried out, a subject with 
appropriate linguistic sophistication 
can be ‘conditioned’ so that the key 
word elicits the electrodermal re- 
sponse while all other words are neu- 
tral in this respect. It is then a simple 
matter to establish the threshold for 
intelligibility by interspersing pres- 
entations of the key word with pres- 
entations of neutral items. The stim- 
ulus level is concomitantly varied un- 
til the minimum intensity at which 
the key word elicits the electroder- 
mal response is found. This level is 
the threshold of intelligibility for the 
key word. Confirmation that this 
threshold represents a more general 
threshold of intelligibility for speech 
is provided by the fact that both the 
neutral stimuli and the key word 










elicit electrodermal responses if they 
are presented just under this level. At 
this point, however, all responses are 
essentially equal and are of a lesser 
magnitude than is the supra-threshold 
response to the key word. This re- 
duced electrodermal response to all 
speech stimuli indicates that the sub- 
ject can detect the presence of speech, 
but that he cannot differentiate 
whether or not the item is the key 
word. A further drop in presentation 
level eliminates all electrodermal re- 
sponses, revealing that the speech 
items are no longer even detectable. 
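The three response patterns just described separate two thresholds. The sketch below is one way the pattern might be tabulated; the input format, the function names and the two boolean judgments per level are assumptions made for illustration and are not specified in the text.

```python
def classify_level(key_is_differential, any_response):
    """Interpret the response pattern observed at one presentation level.

    key_is_differential: the key word evokes a response clearly larger
        than that evoked by neutral words.
    any_response: some electrodermal response follows the speech items.
    """
    if key_is_differential:
        return "intelligible"      # key word is recognized as such
    if any_response:
        return "detectable only"   # speech is heard but not differentiated
    return "inaudible"             # no response to any item


def find_thresholds(observations):
    """Return (intelligibility, detectability) thresholds in db.

    observations: dict mapping level in db to a tuple
        (key_is_differential, any_response).
    A threshold is None when no level meets its criterion.
    """
    intelligible = [lvl for lvl, (key, _) in observations.items() if key]
    detectable = [lvl for lvl, (_, resp) in observations.items() if resp]
    return (min(intelligible, default=None), min(detectable, default=None))
```

For a hypothetical record in which differential responses occur down to 12 db and undifferentiated responses persist down to 6 db, `find_thresholds` returns `(12, 6)`.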

Obviously the phenomena just de- 
scribed will be revealed only by cases 
possessing sufficient linguistic skill and 
maturity so that they achieve per- 
ceptual differentiation among speech 
stimuli. It follows that this technique 
of objective speech audiometry is 
limited in its clinical application to 
such patients. Nonetheless, the meth- 
odology is of great value when ap- 
propriately employed, as, for example, 
to delineate the speech thresholds of 
adults exhibiting ‘non-organic’ hear-
ing losses. The usefulness of such an
application has been demonstrated by 
Ruhm and Menzel (2). 

Before concluding that such ap- 
plications are justifiable, however, it 
was necessary to determine whether 
the procedure does, in fact, designate 
the threshold for speech reception and, 
if so, to estimate its accuracy as a 
measuring tool. These questions were 
investigated by administering a series 
of three speech reception tests to each 
of 40 subjects, 20 with normal hear- 
ing and 20 with conductive hearing 
loss. Two of the tests yielded volun- 
tary threshold for speech while the 
third employed the electrodermal 
method. The adequacy of the elec- 
trodermal results was assessed by sta- 
tistical analyses of the agreements 
among the three tests. 







All tests were administered in a 
clinical suite with acoustical char- 
acteristics appropriate for determina- 
tion of speech reception threshold by 
monitored live-voice. The control 
room contained a custom-built speech 
audiometer and a Grason-Stadler psy- 
chogalvanometer, Model E664. The 
testing room contained a set of PDR- 
8 earphones, which were connected 
to the output of the speech audiom- 
eter, and the microphone for the 
subject’s talk-back system. The ter- 
minal attachments from the psycho- 
galvanometer were also in this second 
room. These attachments included the 
two zinc electrodes for transmitting 
shock, the two zinc electrodes for de- 
tecting changes in skin resistance, and 
the steel grounding plate used to iso- 
late the measuring electrodes elec- 
trically from the shock electrodes. 
These electrodes and grounding plate 
were attached to the subject in the 
conventional manner before begin- 
ning speech audiometry. Conventional 
speech thresholds could, of course, be 
obtained either with or without the 
electrodes affixed to the subject. 

The electrodermal determination of 
speech threshold was carried out 
monaurally in the following manner. 
After the subject had been seated 
comfortably, the electrodes were ap- 
plied in the usual way. The subject 
was told that he would hear a num- 
ber of words which he was not to 
repeat. He was also instructed to ex- 
pect an electric shock at certain times 
during the test. The examiner then 
proceeded to present spondee words 
by monitored live-voice. The spon- 
dees used were the words which 
Hirsh et al. (8) selected for the W1 
and W2 recordings. These particular 
words were chosen because of their 
demonstrated usefulness in measuring 
speech reception threshold, because 
of the homogeneity and familiarity of 






















the items involved, and because of 
their ready availability to the clin- 
ician. One word, ‘baseball,’ was ar- 
bitrarily designated as the key word 
to which the unconditioned stimulus 
was to be linked. Preliminary trial 
had shown that any one of the 36 
spondees might have been selected for 
this purpose. The spondees were pre- 
sented at intervals ranging from five 
to 20 seconds. Both the stimulus se- 
quence and the intervals were ran- 
domized, except that each group of 
four items included one presentation 
of ‘baseball.’ During the process of 
‘conditioning,’ the electric shock fol- 
lowed this word consistently but was 
not given in association with any 
other words. 
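The presentation schedule described above can be sketched as follows. This is an illustration under stated assumptions: the neutral words listed are stand-ins rather than the authors' list, intervals are drawn uniformly between five and 20 seconds, and the function and field names are hypothetical.

```python
import random

# Illustrative neutral spondees only; the text names 'baseball' as the key
# word but does not list the remaining items.
NEUTRAL_WORDS = ["airplane", "cowboy", "hotdog",
                 "railroad", "sidewalk", "mushroom"]


def build_schedule(neutral_words, key_word, n_groups, rng,
                   group_size=4, min_gap_s=5.0, max_gap_s=20.0):
    """Build a randomized sequence with one key word per group of items.

    Each entry records the word, the interval preceding it in seconds,
    and whether electric shock follows (key word only).
    """
    schedule = []
    for _ in range(n_groups):
        # Three different neutral items plus the key word, in random order.
        group = rng.sample(neutral_words, group_size - 1) + [key_word]
        rng.shuffle(group)
        for word in group:
            schedule.append({
                "word": word,
                "gap_s": rng.uniform(min_gap_s, max_gap_s),
                "shock": word == key_word,
            })
    return schedule
```

Passing a seeded generator such as `random.Random(0)` makes a given schedule reproducible from one session to the next.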


Test items were initially presented 
at a clearly audible level, that is, 70 
db re normal threshold for subjects 
without appreciable hearing loss. The 
presentation was continued until a 
distinctive electrodermal response was 
firmly ‘conditioned’ to the key word, 
‘baseball.’ The requirements for dis- 
tinctiveness were (1) that the am- 
plitude of stylus deflection following 
the key word must be at least 10 mil- 
limeters greater than the largest de- 
flection following any neutral word 
at a given sensation level and (2) 
that the rate of stylus deflection must 
be at least two millimeters per second 
from the time of onset of the con- 
ditioned stimulus. In order that a re- 
sponse to the key word might be re- 
corded without being contaminated 
by too early a reaction to the shock, 
the latter was not presented until the 
stylus deflection following a key
word presentation had reached its 
maximum amplitude. It was originally 
thought that this procedure might 
result in an unusually long latency of 
response. However, there was no evidence
of such an effect. ‘Conditioning’
was considered to be established
when three consecutive responses sat- 
isfying the above criteria had been 
elicited. 
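
The two distinctiveness requirements and the three-in-a-row rule stated above can be sketched in code. This is only an illustrative sketch: the class name, function names and sample readings are hypothetical, and only the numerical limits (10 millimeters, two millimeters per second, three consecutive responses) come from the text.

```python
from dataclasses import dataclass

# Limits taken from the text; everything else here is a hypothetical illustration.
MIN_MARGIN_MM = 10.0        # key-word deflection must exceed largest neutral one by this much
MIN_RATE_MM_PER_S = 2.0     # minimum rate of stylus deflection from stimulus onset
REQUIRED_RUN = 3            # consecutive distinctive responses needed


@dataclass
class Deflection:
    amplitude_mm: float      # peak stylus deflection after the key word
    rate_mm_per_s: float     # rate of deflection measured from stimulus onset


def is_distinctive(key, neutral_amplitudes_mm):
    """Apply both requirements to one key-word response at a given level."""
    largest_neutral = max(neutral_amplitudes_mm, default=0.0)
    margin_ok = key.amplitude_mm - largest_neutral >= MIN_MARGIN_MM
    rate_ok = key.rate_mm_per_s >= MIN_RATE_MM_PER_S
    return margin_ok and rate_ok


def conditioning_established(key_responses, neutral_amplitudes_mm):
    """True once three consecutive key-word responses satisfy both requirements."""
    run = 0
    for response in key_responses:
        run = run + 1 if is_distinctive(response, neutral_amplitudes_mm) else 0
        if run >= REQUIRED_RUN:
            return True
    return False
```

A response that misses either requirement resets the count, which is one reading of "three consecutive responses"; the article does not spell out how interruptions were handled.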


After establishment of ‘condition- 
ing,’ the speech audiometer was set 
for maximum attenuation, and the 
speech reception threshold was meas- 
ured. This was accomplished by an in- 
itial ascent in intensity to the level 
where differential electrodermal re- 
sponses were again obtained. Next, al- 
ternating descents and ascents were 
carried out in two-db steps in order 
to cross the threshold four times in 
each direction. The speech reception 
threshold was considered established 
at the lowest level where acceptable 
differential responses were obtained at 
least 50 per cent of the time. Ap- 
proximately 22 minutes were required 
for this procedure. The required time 
can be decreased by using larger steps 
initially and decreasing their size 
rapidly in order to ‘bracket’ the 
speech reception threshold. 
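
The threshold criterion just described (the lowest level at which acceptable differential responses occur at least 50 per cent of the time) can be sketched as follows. The function name, the data layout and the sample tallies are hypothetical; only the 50 per cent rule and the two-db step size are taken from the text.

```python
def speech_reception_threshold(trials_by_level):
    """Return the lowest level (db) whose responses meet the 50 per cent criterion.

    trials_by_level maps a presentation level in db to a list of booleans,
    one per key-word presentation (True = acceptable differential response).
    Returns None when no level qualifies.
    """
    qualifying = [
        level
        for level, outcomes in trials_by_level.items()
        if outcomes and sum(outcomes) / len(outcomes) >= 0.5
    ]
    return min(qualifying) if qualifying else None


# Hypothetical tallies collected in two-db steps around the threshold.
example = {
    -6: [False, False, False, False],
    -4: [False, True, False, False],
    -2: [True, False, True, False],
    0: [True, True, True, True],
}
print(speech_reception_threshold(example))
```

In this made-up record the level of -2 db is the lowest at which half of the presentations produced an acceptable response, so it would be taken as the threshold.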

As mentioned earlier, two types of 
data were gathered in order to assess 
the validity of the method: (1) speech 
thresholds obtained non-voluntarily 
by electrodermal audiometry and (2) 
voluntary speech thresholds deter- 
mined in the conventional manner. 

Each subject underwent the same 
routine. One clinician measured the 
voluntary threshold for spondaic 
words by the conventional live-voice 
technique; that is, the subject repeated 
the words he heard while the clinician 
varied their level until the threshold 
of intelligibility was reached. Soon 
thereafter another clinician deter- 
mined the speech threshold by elec- 
trodermal audiometry. Finally, a sec- 
ond voluntary threshold was procured 
by a third clinician. All tests were 
given on the same day, and care was 
taken to guarantee that each participating
clinician was kept in ignorance
of the results obtained by the other
two.

TABLE 1. Individual speech reception thresholds yielded by 20 subjects without hearing loss.

                      Hearing Level (in db)
Subject   First Voluntary   Electrodermal   Second Voluntary
          Threshold         Threshold       Threshold

  1           -2               -2               -2
  2            2               -2               -4
  3           -2               -2               -2
  4           -2               -2               -2
  5            6                2                4
  6            2                2                2
  7            0                0               [illegible]
  8            0                2               -2
  9           +4               -4               -4
 10            2                0                0
 11            0               -2               -2
 12           -4               -4               -4
 13            0               -2                2
 14           -2                4               -2
 15           -2                6               [illegible]
 16           10                6                6
 17            2               [illegible]      -2
 18            4                4                4
 19            2               -4                4
 20           -2               -2                2

Mean         0.1              -1.4             -1.1
S. D.        3.4               2.9              2.9

The validity of electrodermal 
speech audiometry, when performed 
according to the foregoing method, 
was assessed by administering the fore- 
going sequence of tests to two groups 
of subjects. The first group consisted 
of 20 young adults with normal 
hearing. These persons were North- 
western University students. Fourteen 
were men and six were women. They 
ranged in age from 19 to 28 years. 
The second group consisted of 12 men 
and eight women with conductive 
hearing losses ranging between 30 and 
60 db as determined by conventional 
pure tone and speech audiometry. 
Two additional cases with conductive
impairment were eliminated from the 
study because of their failure to yield 


‘conditioning’ of the electrodermal 
skin response. These subjects with 
impaired hearing were volunteers 
from the Chicago area on whom defin- 
itive otological diagnoses were avail- 
able. They ranged from 19 to 80 years 
in age. 


Results and Discussion 


The three sets of thresholds yielded 
by the 20 normals are reported in 
Table 1. Several facts and relation- 
ships are immediately apparent. The 
group possessed highly homogeneous 
hearing, as indicated by the finding 
that the range of thresholds was only 
14 db. Secondly, the voluntary thresh- 
olds were very reliable. Not only does 
the mean threshold for the second 
test (-1.1 db) deviate only 1.2 db 




























TABLE 2. Individual speech reception thresholds yielded by 20 subjects with conductive hearing loss.

                      Hearing Level (in db)
Subject   First Voluntary   Electrodermal   Second Voluntary
          Threshold         Threshold       Threshold

  1           54               48               48
  2           54               50               50
  3           44               42               42
  4           50               50               52
  5           46               44               46
  6           44               44               42
  7           58               60               60
  8           38               36               38
  9           52               50               50
 10           48               48               48
 11           46               42               44
 12           36               36               36
 13           50               48               48
 14           42               42               42
 15           36               32               32
 16           36               36               36
 17           34               34               32
 18           54               52               50
 19           48               46               46
 20           46               46               46

Mean        45.8             44.3             44.4
S. D.        6.7              6.8              6.9








from the mean threshold of the first 
test (0.1 db), but 10 subjects obtained 
identical thresholds on the two tests 
and only two subjects showed dis- 
crepancies of four db. The coefficient 
of correlation is high, .92 (Pearson 
product-moment), even though the 
range of measures is quite restricted. 
Finally, it is noteworthy that the 
speech reception thresholds obtained 
by electrodermal audiometry are in as 
close agreement with the two sets of 
voluntary thresholds as these latter are 


with one another. The mean electro- 
dermal threshold (-1.4 db) is only 1.5 
db from the mean for the first vol- 
untary (0.1 db) and only 0.3 db from 
the mean of the second voluntary 
(-1.1 db). The coefficients of cor- 
relation are .85 and .94, respectively. 
Moreover, 15 of the electrodermal 
thresholds were identical with the 
second voluntary thresholds, and the 
discrepancy was only two db in the 
remaining five cases. Thus, electro- 
dermal audiometry, when evaluated 


TABLE 3. Error of predictability of voluntary and electrodermal thresholds for speech.

Prediction                                 Standard Error of Estimate (in db)
                                           Normal Group     Conductive Group

Electrodermal from First Voluntary
Electrodermal from Second Voluntary
First Voluntary from Electrodermal
Second Voluntary from Electrodermal
Second Voluntary from First Voluntary
First Voluntary from Second Voluntary

1.92
1.91
1.37
1.36
1.67
1.67
















by comparison with voluntary thresh- 
olds, emerged as a highly valid 
method for determining the speech 
thresholds of the 20 normal hearers 
under consideration here. 


The analogous data for the 20 cases 
with conductive hearing loss are re- 
ported in Table 2. As would be ex- 
pected, the range of thresholds (28 
db) is greater than for the group of 
normals. However, the reliability 
of the voluntary thresholds proved 
essentially as good. The mean hearing 
loss revealed by the initial test was 
45.8 db, while the mean for the sec- 
ond test was 44.4 db. One subject 
showed a discrepancy between tests 
of six db and three yielded discrep- 
ancies of four db, while seven ob- 
tained identical scores. The coeffi- 
cient for the correlation between the 
two sets of results is .96. Moreover, 
the electrodermal speech thresholds 
agreed as well with the voluntary 
thresholds as the latter did with one 
another. The mean hearing loss by 
electrodermal audiometry was found 
to be 44.3 db, which is within 0.1 db 
of the mean for the second set of vol- 
untary measurements. Here, there 
were seven cases where the discrep- 
ancy between these two thresholds 
was two db, while identical scores 
were yielded by the other 13 sub- 
jects. The coefficient of correlation 
was found to be .98. The electroder- 
mal results do not coincide quite so 
closely with the first voluntary 
thresholds, but even here the correla- 
tion proved to be .96. Thus, with 
conductive losses, too, electrodermal 
speech audiometry was found to be 
valid, if evaluated by comparison with 
voluntary thresholds. 
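
The group statistics quoted for the conductive cases can be rechecked from the individual values in Table 2. The sketch below is an illustration only: the helper names are hypothetical, and the product-moment formula is the conventional one, which the article names but does not write out.

```python
from math import sqrt

# Individual thresholds from Table 2 (hearing level in db), subjects 1 to 20.
first_voluntary = [54, 54, 44, 50, 46, 44, 58, 38, 52, 48,
                   46, 36, 50, 42, 36, 36, 34, 54, 48, 46]
electrodermal = [48, 50, 42, 50, 44, 44, 60, 36, 50, 48,
                 42, 36, 48, 42, 32, 36, 34, 52, 46, 46]
second_voluntary = [48, 50, 42, 52, 46, 42, 60, 38, 50, 48,
                    44, 36, 48, 42, 32, 36, 32, 50, 46, 46]


def mean(values):
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Conventional Pearson product-moment correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def discrepancy_counts(xs, ys):
    """Map each absolute difference in db to the number of subjects showing it."""
    counts = {}
    for x, y in zip(xs, ys):
        diff = abs(x - y)
        counts[diff] = counts.get(diff, 0) + 1
    return counts


print(round(mean(first_voluntary), 1),
      round(mean(electrodermal), 1),
      round(mean(second_voluntary), 1))
print(discrepancy_counts(first_voluntary, second_voluntary))
print(discrepancy_counts(electrodermal, second_voluntary))
print(round(pearson_r(electrodermal, second_voluntary), 2))
```

The means and the counts of identical and discrepant scores obtained this way agree with the figures given in the text; the correlation values depend on rounding and are best compared with the reported coefficients only approximately.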

The noteworthy equivalence and 
high interdependence among the 
three sets of tests are summarized dif- 
ferently in Table 3. This table re- 


ports the standard errors of estimate 
which were calculated to determine 
the precision with which the electro- 
dermal speech test results were pre- 
dictive of the voluntarily obtained 
thresholds, as well as the reverse. 
These standard errors were smaller 
for the group of normals than for the 
group of conductives, but even in the 
latter instance, they were all less than 
two db. Furthermore, the predictive 
precision between the electrodermal 
test and the second voluntary test (in 
either direction) was better for both 
groups than was the predictive pre- 
cision between the two voluntary 
tests. Viewed statistically, the results 
for electrodermal speech audiometry 
were, in the present study, indis- 
tinguishable from the results for vol- 
untary speech audiometry. 
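
The article does not state how the standard errors of estimate were computed. A minimal sketch, assuming the conventional formula (standard deviation of the predicted measure multiplied by the square root of one minus the squared correlation), is given below; the function name is hypothetical.

```python
from math import sqrt


def standard_error_of_estimate(sd_predicted, r):
    """Conventional standard error of estimate: s_y * sqrt(1 - r^2).

    Assumption: this textbook formula is used here for illustration only;
    the article does not say which computation was followed.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation coefficient must lie between -1 and 1")
    return sd_predicted * sqrt(1.0 - r * r)


# Reported values for the conductive group: S.D. of the second voluntary
# thresholds 6.9 db, correlation with the electrodermal thresholds .98.
see = standard_error_of_estimate(6.9, 0.98)
print(round(see, 2))  # roughly 1.37
```

Under this assumption a correlation of .98 with a standard deviation near 7 db leaves an error of prediction well under two db, which is consistent with the statement in the text.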

Further confirmatory evidence 
emerges when the speech data for the 
conductive loss group are compared 
to the data on hearing for pure tones.*
For one thing, the mean pure tone 
loss for the group, averaged for the 
500-2000 cycle band, was 46.3 db. 
This mean is very close to the values 
of 45.8, 44.4 and 44.3 db obtained for 
the two voluntary speech tests and 
electrodermal speech audiometry, re- 
spectively. Second, the standard de- 
viation for the pure tone averages was 
6.8 db, which is essentially identical 
with the companion statistics for the 
three sets of speech measurements. 
Lastly, the correlation between the 
500-2000 cps averages and the speech 
reception thresholds proved to be 
very good in comparison to the usual 
results for clinical data (4). The co- 
efficients of correlation between pure 
tone data and the two sets of volun- 
tary speech thresholds were .89 and 


*The pure tone data were the clinical 
audiograms on hand when the subjects were 
selected for the study. 



















[Figure 1 graphic: chart headed ‘Response to Different Stimulus Levels’; horizontal axis, time in seconds (0 to 100).]

FIGURE 1. Tracing illustrating electrodermal
response to different levels of speech stimuli:
A, key word perceived; B and C,
neutral words perceived; D, key word detected
but not perceived; E, neutral word
detected but not perceived; F, key
word not detected; G, neutral word not
detected.


.85, respectively. The correlation with 
the speech thresholds obtained by the 
electrodermal method was .86. Such 
findings add to the confidence with 
which the results of the present study 
may be accepted. 


A final point warrants attention. As 
stated earlier, the electrodermal re- 
sponses exhibited by a subject are of 
two types. The two are easily dif- 
ferentiated in the chart which the 
electrodermal apparatus yields. This 
chart, when properly interpreted, al- 
lows the clinician to distinguish the 
threshold of intelligibility from the 
threshold of detectability for speech. 


Figure 1 exemplifies this situation 
with a sample record taken from a 
subject’s chart. After ‘conditioning’ 
had been achieved, the response to the 
key word (in this instance, ‘baseball’) 
was sharp and definitive as long as the 
word remained intelligible. This type 
of response is illustrated by A in the 
figure. Concomitantly, no appreciable 
galvanic response was elicited by 
other spondees (see B and C). By con- 
trast, when speech was reduced to 


the level where perceptibility was lost 
yet where detectability was retained, 
an electrodermal response of intermediate
magnitude was elicited by all
spondees. In Figure 1, D is such a reaction
when the stimulus was ‘baseball’
and E is one when the stimulus
was a neutral spondee. Of course, as 
exemplified by F and G in the figure, 
a sufficient additional reduction in 
signal intensity eliminated all re- 
sponses. 

The fact that electrodermal re- 
actions differentiate themselves in the 
manner just described allows the clin- 
ician to distinguish two boundaries of 
hearing for speech: (1) the boundary 
where understanding of the stimulus 
is replaced by simple detection of 
speech, and (2) the boundary beyond 
which awareness of the presence of 
speech is also lost. 


Summary 


An effective technique for the 
measurement of the speech reception 
threshold by means of the electro- 
dermal response is described. The 
basic feature of the technique is that 
electric shock is used to ‘condition’ 
the electrodermal response to a single 
speech item, the key stimulus. This 
stimulus is interspersed randomly with 
other speech stimuli. When properly 
‘conditioned,’ a strong electrodermal 
reaction occurs whenever the key 
stimulus is heard. In contrast, neutral 
speech stimuli fail to elicit vigorous 
electrodermal changes. 

This method proved highly valid 
when evaluated on 20 subjects with 
normal hearing and on 20 subjects 
with moderate conductive losses. The 
criterion of validity was agreement 
between the voluntary and electro- 
dermal thresholds for speech obtained 
from these subjects. The results by 







electrodermal audiometry were indis- 
tinguishable statistically from the re- 
sults for the voluntary method, and 
the consistency of consecutive thresh- 
olds for a single subject was excellent. 


References 


1. Bordley, J. E., and Hardy, W. G., A
   study in objective audiometry with the
   use of a psychogalvanometric response,
   Ann. Otol., St. Louis, 58, 1949, 751-759.

2. Bordley, J. E., Hardy, W. G., and
   Richter, C. P., Audiometry with the
   use of galvanic skin-resistance response,
   Johns Hopk. Hosp. Bull., 82, 1948, 569.

3. Bordley, J. E., and Haskins, H. L., The
   role of the cerebrum in hearing, Ann.
   Otol., St. Louis, 64, 1955, 370-382.

4. Carhart, R., Monitored live-voice as
   a test of auditory acuity, J. acoust. Soc.
   Amer., 17, 1946, 339-349.

5. Doerfler, L. G., Neurophysiological
   clues to auditory acuity, JSHD, 13,
   1948, 227-232.

6. Doerfler, L. G., and McClure, C. T.,
   The measurement of hearing loss in
   adults by galvanic skin response,
   JSHD, 19, 1954, 184-189.

7. Heller, M. F., et al., Functional Otology,
   The Practice of Audiology. New
   York: Springer Pub. Co., Inc., 1955.

8. Hirsh, I. J., et al., Development of
   materials for speech audiometry, JSHD,
   1952, 321-337.

9. Johnson, K. O., Work, W. P., and
   McCoy, G., Functional deafness, Ann.
   Otol., St. Louis, 65, 1956, 154-170.

10. Knapp, P. H., and Gold, B. H., The
    galvanic skin response and diagnosis of
    hearing disorders, Psychosom. Med.,
    12, 1950, 6-22.

11. Meritser, C. L., and Doerfler, L. G.,
    The conditioned galvanic skin response
    under two modes of reinforcement,
    JSHD, 19, 1954, 350-359.

12. Ruhm, H. B., and Menzel, O. J., Objective
    speech audiometry in cases of
    functional hearing loss (in preparation).

13. Stewart, K. C., A new instrument for
    detecting the galvanic skin response,
    JSHD, 19, 1954, 169-173.

14. Stewart, K. C., Some basic considerations
    in applying the GSR technique to
    the measurement of auditory sensitivity,
    JSHD, 19, 1954, 174-183.

















Stapedolysis (Stapes Mobilization) 
And The Nomograph Technic 


Victor Goodhill 


The term ‘stapedolysis’ is used to 
designate a procedure aimed at ‘lysis’ 
or freeing of adhesions around the 
stapes. The purpose of the present ar- 
ticle is to discuss and to evaluate the 
nomograph technic and stapedolysis 
(stapes mobilization) which is the ob- 
jective of the surgical procedure. 

That ‘ankylosis’ of the stapes foot- 
plate can produce a conductive type 
of deafness has been known for over 
a century. Kessel (13), in 1876, began
the surgical saga of attempts at lysis
of the fixed stapedial footplate. He
was followed by many otologists who
sought surgical techniques to reopen
the closed pathway of perilymphatic
sound conduction to the basilar membrane.
Blake (1), Jack (12), Miot
(16), Holmgren (11) and many others
were identified with attempts at re- 
creation of an acoustically mobile 
perilymphatic pathway to the organ 
of Corti. 

Most of the attempts of these early 
surgical investigators apparently re- 
sulted in failures, and whatever the 
causes were, the surgical approach in 





Victor Goodhill (M.D., University of 
Southern California, 1937) is Clinical Pro- 
fessor of Surgery (Otology), University of 
Southern California School of Medicine. 
He is also Senior Attending Otologist, 
Cedars of Lebanon Hospital, Los Angeles 
County Hospital, and Childrens Hospital, 
as well as Director of Deafness Research 
Laboratory at Childrens Hospital, Los An- 
geles. 




otosclerosis was transferred to the 
vestibular labyrinth shortly after the 
turn of the century. Early failures in 
this new story of fenestration surgery 
were also discouraging. The research 
of Holmgren (11), the persistence of
Sourdille (20) and the final brilliant 
technical achievements of Lempert 
(14) brought to surgery for the first 
time a practical method for restora- 
tion of hearing in otosclerosis by fe- 
nestration of the external semicircular 
canal. 

Nevertheless, the possibility of di- 
rect approach to the ankylosed 
stapes continued to intrigue some 
otologists. Rosen’s (18, 19) reports on 
stapes mobilization have been most 
encouraging, and have again turned 
the interests of otologists back to the 
stapedio-vestibular articulation. Re- 
cent surgical experiences in this field 
(3, 4, 5, 6, 7) have suggested a technic 
which appears advantageous, and 
which has yielded encouraging pri- 
mary results in both ideal and fair 
candidates for surgical intervention in 
‘otosclerosis.’ 

In this article, the term ‘otosclerosis’
is used in the context of ‘clinical
otosclerosis,’ a term frequently encountered
in fenestration literature.
This term is conceived as describing 
footplate ankylosis, probably by a 
variety of pathologic states, only one 
of which is the typical ‘histologic 
otosclerosis’ as described by Wittmaack
(21), Guild (10), Lempert and
Wolff (15), Nager (17), Guggenheim 
(9) and many others. Clinical evi- 
dence is accumulating which points to 
the probable existence of lesions other 
than the classic ‘herde’ of histologic 
otosclerosis, in the production of 
footplate ankylosis. 


It is quite possible that the natural 
history of this disease is characterized 
first by an ‘otospongiosis’ and that 
the ‘otosclerosis’ represents a healing 
phase. It is probable, therefore,
that the healing phase and its osteogenesis
are actually responsible for the
deafness. Otosclerotic disruption of 
the stapedio-vestibular joint and ‘an- 
kylosis’ of the stapes is the primary 
lesion. 

Otosclerotic invasion and stiffening 
of the round window membrane may 
occur, and otosclerosis may also in- 
volve the cochlea itself, frequently in 
the spiral ganglion. There is no valid 
treatment for the disease itself. How- 
ever, the conductive deafness it pro- 
duces may be alleviated surgically to 
a great degree. 


Surgical Physiology—Stapedolysis 
and Fenestration Differences 


[Figure 1 graphic, labeled parts: external canal widened, mastoidectomy; ampulla of lateral semicircular canal with new fenestra; facial nerve; round window; tympanomeatal membrane.]

FIGURE 1. Schematic representation of new
acoustic pathway via fenestration in case of
stapedial footplate rigidly ankylosed by
otosclerosis and not mobilizable by stapedolysis.
Note deformed tympanic membrane,
and absent incus and malleus head.
























[Figure 2 graphic: audiogram with frequency on the horizontal axis and a vertical scale running from NORMAL to 100. Legend: X, B.C.; preop. A.C.; postop. A.C.]

FIGURE 2. Illustration of pre-operative and
post-operative air conduction thresholds in
a successful fenestration operation.


The direct approach to the ankylosed
stapedial footplate is transtympanic
and primarily peribasal. Occasionally
a trans-incudal or trans-capitular
approach may be used, especially
in very early or mild ankylosis
cases.

[Figure 3 graphic: audiogram with frequency on the horizontal axis and a vertical scale running from NORMAL to 100. Legend: B.C.; preop. A.C.; postop. A.C.]

FIGURE 3. Illustration of pre-operative and
post-operative air conduction thresholds in
a successful stapedolysis procedure.

In fenestration (Figure 1) the dis- 
eased area is bypassed. The fenestra- 
tion operation requires the following 
steps: (1) endaural modified radical 
mastoidectomy; (2) removal of incus 
and head of malleus; (3) trephine 
opening into vestibular perilymphatic 
labyrinth, via horizontal semicircular 
canal; and (4) plastic skin flap—tym- 
panic membrane maneuver to seal the 
perilymph space. Through these steps, 
a new window is created to take the 
place of the non-functioning oval 
window, thus providing the necessary 
perilymph mobility on either side of 
the basilar membrane. An acoustic 
bypass has thus been created by a 
back-door operation. 

Successful stapedolysis, by utilizing 
the undisturbed natural acoustic path- 
way, should yield more efficient 
acoustic transmission. That this is fre- 
quently (but not always) true is seen 
in a comparison of audiograms of a 
patient who has a fenestration in one 
ear (Figure 2) and a stapedolysis in 
the other (Figure 3). 

FIGURE 4. Posterior omega incision.

FIGURE 5. Elevation and enucleation of
canal-tympanic flap.

When the stapedolysis procedure is
successful, the advantages to the patient
are important. Because of the
preservation of the functioning tym- 
panic membrane and ossicular chain, 
the hearing restoration is usually ex- 
cellent. The surgical procedure is 
more benign for the patient in view of 
short hospitalization, simple and brief 
post-operative course and relative ab- 
sence of side effects and discomfort. 


Procedure 


Pre-operative medication consists of 
mild barbiturate sedation, in order to 
permit cooperation in surgical au- 
diometry. Local anesthesia is pro- 
duced by xylocaine meatal block. 


Surgical Technic. The surgical pro- 
cedure is performed through an ear 
speculum. An omega shaped incision 
(Figure 4) is made on the posterior 
canal wall from the annulus out 
to the bony cartilaginous junction 
through skin and periosteum. Dermal 
periosteal flap (Figure 5) is elevated 
to the tympanic sulcus. Fibrous annulus
of tympanic membrane is dissected
out of sulcus. Dermal tympanic
flap is dissected anteriorly to expose 










posterior half of tympanic cavity. In- 
cudo-stapedial joint, chorda tympani 
nerve, stapedial tendon, round win- 
dow niche, and promontory are now 
visualized (Figure 6). 

If the incudo-stapedial joint is located
in a postero-superior direction,
it will be necessary to remove bone
from the annulus to obtain adequate
exposure. This is done by the use of
a fine sharp curet, with the occasional
assistance of a specially angulated dental
polishing burr (Figure 7). When
exposure is adequate, attempts at lysis
of fixed footplate are begun. All of
these attempts are under surgical audiometric
nomograph control, which
will be described later. First lysis attempt
is made with the needle probe
(Figure 6) engaged within the periosteum
of lenticular process of the incus.
This is an exploratory, gentle manipulation.
Force is applied in a medial
direction, with transmission through
stapedial capitulum in a rotary
fashion. The direction is then varied,
depending upon digital resistance encountered.
Frequently lysis is accomplished
by this first maneuver.

FIGURE 6. Palpation of incudostapedial joint.

If preliminary palpation with a 
needle probe reveals a loose incudo- 
stapedial joint, elastic crura, and a 
marked footplate ankylosis, no force 











FIGURE 8. Peribasal anterior stripping.

FIGURE 9. Bipod lysis. This procedure is
preferred.

FIGURE 10. Monopod lysis. This procedure
is used if necessary.

FIGURE 11. Air conduction receiver being
enclosed in sterile sleeve.

FIGURE 12. Placement of air conduction receiver
over speculum within canal.


is used on the joint at all. A direct 
peribasal approach is immediately 
made (this is true in the majority of 
cases). The muco-periosteum of the 
anterior inferior aspects of the foot- 
plate region is elevated to expose the 
otosclerotic lesion (Figure 8), and an 
attempt is made to break through the 
otosclerotic lesion in a circumferential 
manner, preserving inviolate the con- 
tinuity of the stapes (bipod lysis) as 
in Figure 9. If there is evidence of a 
posterior lesion as well as an anterior 
lesion, peribasal fracture through that 
area is performed, also. If there is irreversible
ankylosis and it is impossible
to break through the lesion circumferentially,
an anterior crurotomy
is performed along with a fracture of 
the footplate immediately posterior to 
the lesion, resulting in a monopod 
lysis (Figure 10). When adequate 
lysis has been accomplished, as deter- 
mined by surgical audiometric nomo- 
graph evidence, the dermal-tympanic 
flap is accurately replaced and the 
canal packed with rayon and cellu- 
lose sponge. 


Surgical Audiometry and the No- 
mograph. The surgical audiometric 
nomograph is an integral component 
of the technic (7). Surgical audiom- 
etry is done on the operating table, 
using the air conduction receiver of a 
properly calibrated pure tone audi- 
ometer. The receiver and cord are en- 
closed in a sterile cotton sleeve (Fig- 
ure 11) and placed directly upon the 
ear speculum (Figure 12). The slight 
additional impedance factor is a constant
and can be cancelled out in the
audiometric calculations. 

Three frequencies are used for 
threshold determinations: 500, 1000 
and 2000 cps. A minimum of four 
basic steps is required: Step 1, ob- 
taining the basic first threshold with 










the middle ear closed, but after the 
preliminary dissection has been com- 
pleted and the flap temporarily re- 
placed; Step 2, obtaining the threshold 
with the middle ear open but prior 
to lysis attempts; Step 3, obtaining the 
threshold with middle ear still open 
after lysis has been attempted (This
is the most crucial of the first three
steps and is usually repeated several 
times as lysis maneuvers are varied. It 
is compared with Step 2 for evidence 
of significant threshold shift.); Step 
4, obtaining the final threshold with 
middle ear closed again. The final 
step is confirmatory and is compared 
with Step 1. It is very important, 
however, in that it gives information 
regarding the adequacy of the imped- 
ance matching mechanism of the tym- 
panic membrane ossicular chain. 


Because the bone-air gap is highly 
variable in individual cases, it is ob- 
vious that no set decibel threshold 
shift can be utilized as a guide, but 
some ratio must be available for pre- 
dictive reasoning during the surgical 
procedure. Recently, therefore, some 
changes in surgical audiometry were 
introduced. To obtain an average 
threshold amplitude for comparative 
purposes, the ‘equivalent speech re- 
ception threshold’ advocated by 


Fletcher (2) is used. This formula selects
the two best responses of the
500, 1000, and 2000 cps thresholds and
averages these responses. This ‘equivalent
SRT’ is the single ‘figure of
merit’ for easy comparison of the different
surgical steps. The Fletcher
equivalent SRT has been found quite
comparable with SRT’s obtained from
conventional spondee word lists.

FIGURE 13. Illustration of the pre-operative,
surgical, and post-operative thresholds of
surgical audiometric technic.

[Figure 14 graphic: nomograph with a vertical scale running from NORMAL to 100; abscissa positions for pre-operative air conduction, the surgery steps, and post-operative tests dated 11/10/55, 7/24/56 and 3/12/57.]

FIGURE 14. The nomographic representation
during and after successful stapedolysis surgery.
The post-operative test values are
expressed in equivalent SRT’s derived from
the Fletcher formula.
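
The averaging rule described above can be written out in a few lines. This is a sketch, assuming that the ‘two best responses’ are the two lowest threshold values in db; the function name and the sample thresholds are hypothetical.

```python
def equivalent_srt(t500_db, t1000_db, t2000_db):
    """Average of the two best (lowest) thresholds among 500, 1000 and 2000 cps.

    Assumption: 'best' is read as the smallest hearing loss in db.
    """
    best_two = sorted((t500_db, t1000_db, t2000_db))[:2]
    return sum(best_two) / 2.0


# Hypothetical thresholds in db at 500, 1000 and 2000 cps.
print(equivalent_srt(40, 50, 60))  # 45.0
```

Because a single figure is produced for each measurement, the same function can be applied to the bone conduction values, to the pre-operative air conduction values and to each surgical step before they are plotted.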

In order to obtain a reasonably re- 
liable method of surgical guidance, a 
surgical audiometric nomograph, util- 
izing the Fletcher formula, has been 
devised. This nomograph uses the 
single equivalent SRT figure for bone 
conduction, for pre-operative air con- 
duction, for the four steps of surgical 
audiometry, and for post-operative air 
conduction two weeks following sur- 
gery. The nomograph is constructed as 
illustrated in Figures 13 and 14. The 
ordinate (vertical axis) of the graph 
is divided and numbered in accord- 
ance with the familiar decibel ratio 
with respect to the normal audio- 
metric zero level. The abscissa (hori- 
zontal axis) divisions are made equal 

















[Figure 15 graphic: nomograph with the bone level marked.]

FIGURE 15. Nomographic evidence of inadequate
lysis.


to the ordinate divisions even though 
they represent only the discrete steps 
in audiometry. The pre-operative 
bone conduction is shown by a dashed 
horizontal line at the proper ordinate 
value, using the one equivalent SRT 
figure of merit. This establishes the 
surgical objective or the expected air 
conduction level if the operation is to 
be an unqualified success by the pro- 
posed standard. The abscissa consists 
basically of six positions, which are 
used to indicate first the pre-operative 
air conduction level, then the four 
surgical audiometric steps, and finally 
the position of post-operative air con- 
duction level which will be charted 
two weeks following surgery. Where 
more than one test is made at Step 3, 
the results are shown on the same 
abscissa line labeled (a), (b), (c), 
etc., the last one being the significant 
figure. Step 4 may be treated in 
the same manner, when necessary. 





The crucial stage in this operative 
approach is Step 3, where the 
surgeon has already applied some de- 
gree of force and perhaps some degree 
of lysis of the footplate ankylosis. It 
is at this point that a decision must 
be made as to whether to terminate 
the procedure or continue further. 
Thus, a minimal improvement in hear- 
ing might erroneously lead the sur- 
geon to consider that his efforts have 
been adequate (Figure 15) and he 
may conclude the procedure without 
attaining the ideal gain which could 
have been expected in this particular 
patient. On the other hand, excessive 
applications of force at this point 
without adequate audiometric control 
may be disastrous by destruction of 
one or both crura, permanent disloca- 
tion of the incudo-stapedial joint, or 
inadvertent footplate fenestration. 

The nomographic technic subserves 
several surgical functions as an in- 
valuable audiologic aid. Its chief 
function is determination of adequacy 
of lysis, as described above. The uti- 
lization of predictive reasoning is thus 
made possible at critical Step 3 
for ascertaining the adequacy of the 
lysis force exerted. The average gain 
to be expected between Step 2 and 
the ideal post-operative air conduc- 
tion level can be empirically divided 
into three parts. This is, in general, 
true, whether the required total im- 
provement is 15 or 45 decibels, or any 
step in between. Thus, where the co- 
ordinates are equally spaced, a straight 
line diagonally drawn from the inter- 
section of the pre-operative bone con- 
duction level (the predicted objective 
as a post-operative air conduction 
level) to the surgical Step 2 will be 
divided into three equal parts by the 
intersection of the abscissae of sur- 
gical Steps 3 and 4. Therefore, re- 
gardless of whether the bone conduc- 










tion threshold is high or low, and re- 
gardless of the threshold of Step 2, 
this straight diagonal line which can 
be dotted in at the time of surgery 
is an excellent indication of the ideal 
expectation for surgical Steps 3 and 
4. Thus, following lysis maneuvers, if 
Step 3 falls near the line or above it 
(that is, to its left), it is probable 
that Step 4 and the post-operative air 
conduction will follow suit, provided 
there are no operative or post-opera- 
tive complications. 
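
The three-part division of the expected gain can be sketched numerically. The following is only an illustrative sketch, not part of the published nomograph: it assumes thresholds are expressed in db of hearing loss (a smaller number is better hearing), and the function names `predictive_line` and `on_or_above_line` are hypothetical.

```python
def predictive_line(step2_db, bone_conduction_db):
    """Expected thresholds along the dotted diagonal guide line.

    Assumption: thresholds are in db of hearing loss, so a gain lowers the
    number. The pre-operative bone conduction level is taken as the
    predicted post-operative air conduction level, as in the text.
    """
    total_gain = step2_db - bone_conduction_db
    third = total_gain / 3.0  # the expected gain is divided into three equal parts
    return {
        "step3": step2_db - third,
        "step4": step2_db - 2 * third,
        "postop": bone_conduction_db,
    }


def on_or_above_line(measured_db, expected_db):
    """True when a measured threshold lies on the line or above it (to its left)."""
    return measured_db <= expected_db


# Example: Step 2 threshold of 50 db, pre-operative bone conduction of 20 db.
line = predictive_line(50, 20)
print(line)
print(on_or_above_line(37.5, line["step3"]))
```

With a Step 2 threshold of 50 db and bone conduction at 20 db, the total expected gain is 30 db, so the line passes through 40 db at Step 3 and 30 db at Step 4. How close to the line counts as "near" is left to the surgeon's judgment and is not modeled here.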

The reason that the expected gains 
between Steps 3 and 4, and between 
Step 4 and post-operative air conduc- 
tion level are roughly equal to the 
gain between Steps 2 and 3 is that
the efficiency of the ear
drum-ossicular chain mechanism as
an acoustic transducer is greatest 
when working into the load of a nor- 
mally mobile stapes. The impedance 
mismatch between the drum and 
stapes footplate thus is an approxi- 
mate function of the degree of fixa- 
tion of the latter. The level of Step 
4 is materially influenced by the pre- 
cision with which the middle ear is 
closed and the drum reapproximated
to its sulcus attachment. It is also
related to the presence or absence of
tears or perforations in the tympanic
membrane. If the threshold at Step
4 falls on or above the predictive
diagonal guide line, the probability of
post-operative success is great.

Figure 16. Typical nomographic course in
inadvertent footplate fenestration. Note poor
nomographic response at Step 4 and early
closure with loss of operative gain. Note
also additional drop at Step 4 when gelfoam
was applied (see arrow in the figure).

Figure 17. Nomographic evidence of footplate
fenestra corrected by anterior crurotomy
and monopod lysis. Note improvement
in threshold at Step 3 followed by
marked drop at Step 4. Re-inspection
immediately revealed a solid fixed anterior
crus and a footplate fracture. An anterior
crurotomy and monopod lysis permitted
ossicular continuity with the posterior
footplate fragment and adequate lysis.

If lysis force has inadvertently pro- 
duced a footplate fenestra with a 
perilymph pool open to tympanic air- 
borne sound, this fact will usually 
show up on the nomograph by com- 
parison of Steps 3 and 4. In such a 
case, there will usually be a very 
adequate gain at Step 3, either directly 
on the predictive line or to its left. 
At Step 4 with the drum closed, how- 
ever, in such a footplate fenestra there 
will be either an inadequate gain or a 
major drop in hearing, as illustrated in 
Figures 16 and 17. If there is an in- 
adequate gain, a small piece of gel- 
foam is placed in the intercrural space. 
If such gelfoam packing reverses the 
relationship and creates a normal rise 
at Step 4, it will probably indicate
that the footplate fenestra existed
concomitantly with a mobile monopod
or bipod stapedolysis. Should the
footplate fenestra exist alone, the gelfoam
pack will frequently produce a
temporary drop in threshold at Step
3 accompanied by a similar drop at
Step 4. Thus, we see a diagnostic
value of the nomograph in differentiation
between a footplate fenestra
and a mobile footplate or footplate
fragment (monopod or bipod
stapedolysis).

[Figure 18 graph: curves labeled ‘early closures’ (dashed) and ‘failed to mobilize’; axes labeled Cases and Case Groups (100 to 500).]

Figure 18. Operative failures and early
closures. In the case group between #300
and #350, an attempt at improving results
was made by intentional footplate
fenestration. There was an immediate marked
improvement in operative successes, in
that the number of cases which failed to
mobilize dropped considerably. However,
during this identical period, there was a
very marked increase in early closures, as
noted in the dashed line above. When the
peribasal approach was substituted for the
footplate fenestration technic, the good
immediate results persisted, but the early
closures dropped drastically, thus indicating
that footplate fenestration technics were
spurious surgical victories.

It may be of interest to observe the 
fate of a significant number of foot- 
plate fenestrations that were per- 
formed intentionally early in the 
course of this writer’s mobilization ex- 
periences. In 1955, a group of more 





than 50 patients were subjected to 
footplate fenestration in the hope that 
this would solve many of the pre- 
viously insoluble problems. Immediate 
operative successes were indeed high, 
as indicated on the graph in Figure 
18, but post-operative failures were 
extremely frequent and the overall 
average of successes dropped to a 
very low level during this period. 
When the approach was re-directed 
to the stapedial circumference and to 
the area of the stapedial ligament 
with avoidance of footplate fenestration,
there was a marked improvement in
long-range results.


Surgical Comments. The above-re- 
ported microsurgery is deceptively 
simple and full of anatomic hazards to 
surgeons not intimately familiar with 
the many anomalies of the tympanum. 
To otologic surgeons skilled in fenes- 
tration surgery it is familiar territory; 
but it is still a new type of micro- 
manipulation through a very small 
field, with new problems in land- 
marks and instrumentation. It is pre- 
cise surgery. The procedure may take 
anywhere from 30 minutes to two or 
more hours. 

For the patient, however, this sur- 
gery is not an unpleasant experience. 
It is practically painless. Vertigo may 
occur, but it rarely lasts more than a 
few hours. Prophylactic antibiotic 
therapy is used pre- and post-opera- 
tively. Immediate ambulation is pos- 
sible and patients are discharged on 
the morning following surgery. Im- 
mediate hearing improvement is fol- 
lowed by a temporary drop in acuity 
which is due to edema and packing 
of the canal. The packing is removed 
in six or seven days, at which time 
the hearing improvement returns. 







Results 


Post-operative results have been reported
in great detail in previous
papers (5, 8). Results reported are
derived from the primary post-op- 
erative tests given approximately 
two weeks after surgery. The termi- 
nology here employed is the same as 
that used in previous reports, and may 
be defined as follows: 


Thresholds. The values of acoustic 
threshold are stated in terms of the 
equivalent speech reception threshold 
derived from the Fletcher formula (an 
average of the two best threshold re- 
sponses at 500, 1000, and 2000 cps). 
Thus, since audiometric measurements 
vary in five db steps, the equivalent 
SRT values used here usually vary in 
two and one-half db steps. 
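
As a brief illustration of why the equivalent values fall in two and one-half db steps, the averaging rule described above can be sketched as follows. This is only a sketch under stated assumptions: "best" is taken to mean the lowest threshold in db of hearing loss, and the function name `equivalent_srt` is hypothetical.

```python
def equivalent_srt(t500, t1000, t2000):
    """Average of the two best thresholds at 500, 1000 and 2000 cps.

    Assumption: thresholds are in db of hearing loss, so the two "best"
    responses are the two smallest values.
    """
    best_two = sorted([t500, t1000, t2000])[:2]
    return sum(best_two) / 2.0


# Audiometric readings in 5 db steps give equivalent values in 2.5 db steps.
print(equivalent_srt(40, 45, 60))
```

Because each reading is a multiple of five db, the mean of two readings is always a multiple of two and one-half db.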


No Change. This term is used to in- 
dicate a post-operative air conduction 
level that has not changed by more 
than plus or minus five db with re- 
spect to pre-operative air conduction. 


Losses. Losses are considered to be 
such when the post-operative air con- 
duction level is seven and one-half db 
or more below pre-operative air con- 
duction. 


Partial Gains. Partial gains are gains 
which are seven and one-half db or 
more above pre-operative air conduc- 
tion, but less than those defined as 
‘success.’ 


Success. Dual criteria must still be 
retained for success, since most others 
who report in this field are still using 
the fenestration criteria and thus con- 
sider any post-operative levels within 
30 db of normal as ‘successful.’ However,
this writer is convinced that removal
of the air conduction block (or
eradication of the bone-air gap)
should also be considered as a success,
regardless of the final air conduction
level. Since audiometric
measurements are not considered accurate
to better than five db, an arbitrary
value of seven and one-half db
below pre-operative bone conduction,
above which cases are considered to
be successful, is being used, as before.
(In general, the only confusion likely
to occur from this dual definition of
‘success’ lies in the region where air
conduction is between 30 and 40 db
and bone conduction is near normal.
Most of the successful cases by the
30 db standard only lie in this area,
and materially improve the results reported
for such groups. Approximately
43 per cent of the successes reported
qualify in both categories. The
remainder are about equally divided
between those cases which qualify in
one category only.)

[Figure 19 graph, titled Stapes Surgery ‘Successes’, Dual Criteria: per cent and number of cases plotted against cases in groups of 100 (100 to 700), with curves for total ‘successes,’ 30 db air conduction or better, air-bone gap closed (7½ db or better), and ‘success’ by both criteria. The inset table of totals and percentages is not recoverable.]

Figure 19. Distribution of stapes surgery
‘successes.’

Two criteria of stapes surgery ‘suc- 
cesses’ are (1) achievement of 30 db 
(or better) level for speech frequency 
regardless of air-bone gap, and (2) 
closing of air-bone gap within seven 
and one-half db regardless of the 30 
db level. Figure 19 shows the distribu- 
tion of stapes surgery ‘successes’ for 
a total of 700 cases with four clas- 
sifications of successes: (1) successes 
by the first criterion; (2) successes by 
the second criterion; (3) successes by 


































both criteria; and (4) total of suc- 
cesses by one or the other criterion. 
The trend seen in Figure 19 is,
in general, toward a higher
percentage of successes with successive
groups of 100 cases each.
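
The result categories defined above can be expressed as a small decision rule. The sketch below is illustrative only and rests on assumptions not stated in the text: thresholds are in db of hearing loss, the success criteria are tested before the gain and loss criteria, and the function name `classify_result` and its labels are hypothetical.

```python
def classify_result(preop_ac, preop_bc, postop_ac):
    """Classify one case by the definitions given in the Results section.

    Assumptions: all values are equivalent speech reception thresholds in
    db of hearing loss (smaller is better), and the two success criteria
    are checked before partial gain, loss and no change.
    """
    by_level = postop_ac <= 30.0            # criterion 1: 30 db level or better
    by_gap = postop_ac - preop_bc <= 7.5    # criterion 2: air-bone gap closed
    if by_level and by_gap:
        return "success (both criteria)"
    if by_level:
        return "success (30 db level)"
    if by_gap:
        return "success (air-bone gap closed)"

    gain = preop_ac - postop_ac             # positive when hearing improves
    if gain >= 7.5:
        return "partial gain"
    if gain <= -7.5:
        return "loss"
    return "no change"                      # within plus or minus five db


for case in [(60, 20, 25), (60, 40, 45), (60, 10, 30), (60, 10, 50), (60, 10, 70)]:
    print(case, classify_result(*case))
```

Since the equivalent thresholds move in two and one-half db steps, a change can be five db or seven and one-half db but nothing in between, so the "no change" and "partial gain" or "loss" ranges do not overlap.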

It might be interesting to consider 
a simple yet informative method of 
evaluating the efficacy of a surgical 
technic, not only in stapes surgery but 
in any reconstructive middle ear sur- 
gical procedure. With full realization 
of the weaknesses inherent in bone 
conduction audiometry, there is still 
no quantitative method of assessing 
improvement in middle ear function
other than degree of closure of the 
air-bone gap. Thus, any objective 
measure of surgical efficacy (in con- 
trast to ultimate communication or so- 
cial adequacy of hearing) must in- 
volve this measure of bone-air gap 
changes. 


If the difference between pre- 
operative air conduction and post- 
operative air conduction is divided by 
the pre-operative air-bone gap (the
maximum gain which could normally
be expected), a percentage is then obtained
which indicates the degree to
which the air-bone gap has been 
closed, and thus the degree of surgi- 
cal efficacy. This percentage can then 
be termed ‘per cent improvement’: 
air conduction gain in db divided by 
bone-air gap in db. It will of course 
be apparent that this method of meas- 
uring surgical efficacy can be applied 
to individual cases in exactly the same 
way that it may be applied to aver- 
ages of 100 or more cases. The ‘per 
cent improvement’ criterion will en- 
able each surgeon to assess various 
technics in his own practice and re- 
port the results of his experience, 
without danger of bias being intro- 
duced by patient selection or similar 
variations of patient parameters. 
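
The ‘per cent improvement’ ratio described above can be worked through in a few lines. This is a minimal sketch, assuming thresholds in db of hearing loss; the function name `per_cent_improvement` and the rejection of cases without a measurable gap are illustrative choices, not part of the original method.

```python
def per_cent_improvement(preop_ac, preop_bc, postop_ac):
    """Air conduction gain in db divided by the pre-operative air-bone gap in db.

    Assumption: thresholds are in db of hearing loss, so the gain is the
    pre-operative minus the post-operative air conduction level.
    """
    gap = preop_ac - preop_bc
    if gap <= 0:
        # Illustrative guard: the ratio is undefined without an air-bone gap.
        raise ValueError("no pre-operative air-bone gap to close")
    gain = preop_ac - postop_ac
    return 100.0 * gain / gap


# Pre-operative air conduction 60 db, bone conduction 20 db, post-operative 30 db.
print(per_cent_improvement(60, 20, 30))
```

In this example the gap is 40 db and the gain is 30 db, so three quarters of the gap has been closed. The same function applies to the averages of a series of cases as well as to a single case.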


Conclusions 


The primary direct surgical ap- 
proach to the oval window has been 
revived in the treatment of otosclero- 
sis. Early reports of results in this 
renaissance of stapes mobilization 
(stapedolysis) surgery are encourag- 
ing. Stapes mobilization is of value in 
profoundly deaf patients who still 
have an adequate bone-air gap. Such 
patients following successful stape- 
dolysis may not achieve unaided hear- 
ing but may be able to utilize a hear- 
ing aid with greater comfort and 
tolerance. There are no age restric- 
tions to stapes mobilization surgery. 

The fenestration operation is still 
an important and valuable part of the 
surgical treatment of otosclerosis, but 
now assumes a secondary role in most 
instances. It is applicable in cases of 
irreversible ankylosis or in cases of re- 
peated post-operative re-ankylosis. 

The stapes mobilization procedure 
is not an all or none operation, and 
is greatly dependent upon audio- 
metric guidance with the help of the 
surgical audiometric nomograph. A 
significant number of successful re- 
sults have been observed as stable 
gains for periods of 24 months and 
longer. Further long range observa- 
tion is essential. Post-operative re- 
ankylosis may occur in at least 15 
per cent of the cases, most of which 
can be successfully re-operated. 

In the most recent group of 100 
cases surveyed, in which the peri- 
basal approach was used exclusively, 
successful immediate results were ob- 
tained in 71 per cent, significant gains 
were obtained in another 20 per cent, 
no changes were recorded in seven 
per cent, and losses were recorded in 
two per cent. 


At the present time, the primary 
treatment of choice for clinical oto- 
sclerosis is transtympanic stapes mo- 







bilization (peribasal stapedolysis). Fen- 
estration surgery remains a valuable 
although secondary procedure. 


Summary 


Otosclerotic deafness, which is 
most commonly due to ankylosis of 
the stapedial footplate, is best treated 
by a direct surgical approach to the 
oval window region. This physiologi- 
cal remobilization of the entire middle 
ear mechanism more closely achieves 
normal hearing than the indirect (fen- 
estration) approach. In this very pre- 
cise surgery, reliance is greatly placed 
upon guidance through surgical au- 
diometry. The nomograph guide in 
surgical audiometry has been most 
useful. In the latest series of cases, 
successful immediate post-operative 
results were obtained in 71 per cent 
of cases, with significant gains in 20
per cent, no changes in seven per 
cent, and losses in two per cent. 


References 


1. Blake, C. J., Operation for removal of
the stapes. Boston Med. Surg. J., 127,
1892, 469-470.

2. Fletcher, H., Speech and Hearing in
Communication (2nd Ed.). New York:
Van Nostrand, 1953.

3. Goodhill, V., Trans-incudal stapedolysis
for stapes mobilization in otosclerotic
deafness. Laryngoscope, St. Louis, 65,
1955, 693-710.

4. Goodhill, V., Surgical audiometry in
stapedolysis (stapes mobilization). Arch.
Otolaryng., 62, 1955, 504-508.

5. Goodhill, V., Present status of stapedolysis
(stapes mobilization). Laryngoscope,
St. Louis, 66, 1956, 333-381.

6. Goodhill, V., Instruments for transtympanic
surgery, including trans-incudal
stapedolysis (stapes mobilization).
Trans. Amer. Acad. Ophthal.
Otolaryng., 60, 1956, 602-604.

7. Goodhill, V., and Holcomb, A. L., The
surgical audiometric nomograph in
stapedolysis (stapes mobilization). Arch.
Otolaryng., 63, 1956, 399-410.

8. Goodhill, V., and Holcomb, A. L., A
study of 500 stapes mobilizations.
Laryngoscope, St. Louis, 67, 1957, 515-642.

9. Guggenheim, L. K., Otosclerosis. St.
Louis: L. K. Guggenheim, 1935.

10. Guild, S. R., Histologic otosclerosis.
Ann. Otol., etc., St. Louis, 53, 1944,
246-266.

11. Holmgren, G., Zur Chirurgie der
Otosklerose. Acta Otolaryng., Stockh.,
27, 1939, 338-349.

12. Jack, F. L., Remarkable improvement
in hearing by removal of stapes. Trans.
Amer. otol. Soc., 5, 1891-1893, 284-302.

13. Kessel, J., Über die Durchschneidung
des Steigbügelmuskels beim Menschen
und über die Extraction des Steigbügels,
resp. der Columella bei Thieren. Arch.
Ohr., 11, 1876, 199-217.

14. Lempert, J., Improvement in hearing in
cases of otosclerosis: a new, one stage
surgical technic. Arch. Otolaryng., 28,
1938, 42-97.

15. Lempert, J., and Wolff, D., Otosclerosis:
theory of its origin and development.
Arch. Otolaryng., 50, 1949, 115-155.

16. Miot, C., De la mobilisation de l’étrier.
Rev. Laryng., Paris, 10, 1890, 49, 83, 113,
145, 200.

17. Nager, F. R., Zur Klinik und pathologischen
Anatomie der Otosklerose.
Acta Otolaryng., Stockh., 27, 1939, 542-551.

18. Rosen, S., Palpation of stapes for fixation:
preliminary procedure to determine
fenestration suitability in otosclerosis.
Arch. Otolaryng., 56, 1952, 610-615.

19. Rosen, S., The development of stapes
surgery after five years. Arch.
Otolaryng., 67, 1958, 129-141.

20. Sourdille, M., Techniques chirurgicales
nouvelles pour le traitement des surdités
de conduction. Académie de Médecine,
Ser. 3, 102, 1929, 674-678.

21. Wittmaack, K., Betrachtungen zum
Otoskleroseproblem. Zbl. f. Hals, Nas.,
Ohrenheilk., 1, 1922, 1-15.








Graduate Theses In 


Speech And Hearing Research - 1956 


Franklin H. Knower 


There were 161 graduate thesis titles
reported in the field of speech and 
hearing research for 1955. Of this 
number 127 were Master’s theses and 
34 were doctorates. The degrees were 
conferred by 45 graduate schools. 
The names of these schools are listed 
alphabetically in the following report 
of titles. The degree involved is stated 
at the head of each list of the names 
of degree recipients. These names are 
arranged alphabetically in each list. 
Each thesis title is numbered. The 
index which follows the listing of 
schools and titles contains a classifica- 
tion of the suggested content of the 
thesis research followed by the title 
numbers involved. Many titles are in- 
dexed in more than one area of sub- 
ject matter. Doctorate dissertation 
numbers in this index are followed by 
an asterisk. 


Titles 


University of Alabama 
M.A. Theses 


1. Aaron, Randel Wilson. The study of
speech in interpersonal relationships: 
Techniques for analyzing word length 
in conversation. 

2. Beckelheimer, Frances Alice. The 
study of speech in interpersonal re- 
lationships: Analysis of interruptions 
in a group conversation. 

3. Dearstone, Mary Violette. The study
of speech in interpersonal relationships:
Observer agreement in measuring
visible aspects of speech.

Franklin H. Knower (Ph.D., University
of Minnesota, 1933) is Professor of Speech,
Ohio State University.

4. Esco, Marjorie B. An analysis of stage 
fright as presented in representative 
American college-level speech texts. 

5. Ferguson, Alice. The study of speech 
in interpersonal relationships: Tech- 
niques for analyzing vocabulary used 
in conversation. 

6. Hilliard, Clinton T., Jr. The study of 
speech in interpersonal relationships: 
Techniques for analyzing qualitative 
aspects of phrasing in a counseling 
situation. 

7. Hudson, Dolores Ann. The study of 
speech in interpersonal relationships: 
An analysis of word length in a group 
conversation. 

8. Neal, Maryella. The study of speech 
in interpersonal relationships: Tech- 
niques of analysis for measurement of 
certain visible aspects of speech. 

9. Shore, Pamelia Floyd. The study of 
speech in interpersonal relationships: 
Techniques for analyzing topics of 
conversation. 

10. von Redlich, Mark Hamilton. The 
study of speech in interpersonal rela- 
tionships: A technique for the analysis 
of visible aspects of speech. 


University of Arizona 


M.A. Thesis 

11. Slaughter, Alan. A study of the 
phonemic aspect of Bilingualism in 
Papago Indian children. 

Ball State Teachers College 

M.A. Thesis 

12. Greenwell, Olive Mae. Workbook ma- 
terial for language development for 
deaf children. 

Boston University 
M.Ed. Theses 
13. Berarducci, Joanne. Original stories
for teaching the vocal skills in speaking
and listening.

14. Church, Rose. A manual of illustrative
lessons of the Mueller-Walle method
correlating speech reading with
curricular materials.

15. Des Landes, Marie. Illustrative lessons
correlating lipreading and auditory
training with classroom materials in
the junior and senior high schools.

16. Joyce, Shiela. A word list for articulation
testing and practice classified
by speech sounds and arranged by
reading grade levels.

17. Knox, Barbara. A review of literature
concerning methods for detecting
deafness of a non-organic nature.

18. Meseth, Marilyn. A course of study
for developing voice and articulation
on a second grade level.

19. [illegible] ascertaining current
techniques used in stuttering therapy.

20. Sahlberg, Richard. A survey of voice
[illegible] New England dialect regions.

21. Watkins, Marilyn. The construction
[illegible] ascertaining current techniques
used in stuttering therapy.

Ed.D. Thesis

22. Koch, Albert. A comparative study
of auditory thresholds of spastic cerebral
palsied adults and non-handicapped
adults as measured by standard
audiometric and psychogalvanic skin
resistance.

Bowling Green State University
M.A. Theses

23. Ailing, Karl Edwin. A comparison of
ascending and descending thresholds
as obtained by two special methods of
limits testing programs.

24. Canter, Gerald J. An experimental
investigation of certain [illegible] of the
nature of phonemic discrimination
ability in children with functionally
defective articulation and some of its
possible correlates.

25. Kasten, Roger N. An investigation of
the relationship between stuttering and
the inability to monitor speech
auditorially.

26. Pinson, Agnes Bell. An experimental
study of the palatal efficiency, and
the articulation, voice quality and
intelligibility of cleft-palate speech.

M.S. Theses

27. Havens, Ethel T. An experimental
investigation of speech [illegible]
among hard-of-hearing children.

28. Klapp, Carolyn E. A survey of the
general knowledge of physicians
concerning the field of speech correction.

29. Parobeck, Donna. An investigation of
the utility of Wood’s articulation index
as an independent measure of
articulation proficiency.

30. Stroud, Robert Vernon. A study of
the relations between social distances
and speech differences of white and
negro high school students of Dayton,
Ohio.

Brigham Young University
[degree heading illegible]

31. Atkins, Floy Daun. The [illegible] of
three [illegible] children.

32. McCandless, Gary. A study of the
relative effects of different mental age
[illegible] speech intelligibility of a group
of institutionalized mentally deficient
children.

Brooklyn College
M.A. Theses

33. Cooper, June. The influence of type
of nonfluency on the diagnosis of
stuttering in children.

34. Gardener, Harvey J. Music in relaxation
for speech with cerebral
palsied children.

35. Katz, Herman G. Tests of language
and abstraction in Aphasia.

36. Kolatch, Selma. Recognition of letters,
words and phrases in visual
[illegible].

37. Lee, Wanda W. Speech rehabilitation
as seen from the point of view of the
[illegible].

38. [illegible]

39. [surname illegible], William F. Hearing
in multiple sclerosis.

40. Pollins, Judith L. Tests of concept
formation of the mentally retarded.

41. Rahman, Persephone. The self-concept
and ideal self-concept of stutterers as
compared to non-stutterers.

42. Rees, Pauline. A study of the relationship
between age of operation and
occurrence of hearing loss with cleft
palate.

43. Reiner, Karol S. The influence of 
memorization on frequency of stutter- 
ing and adaptation. 

44. Rivlin, Ada. The effects of bulbar 
poliomyelitis on speech. 

45. Shaw, Beverly G. Hearing losses in 
an urban speech and hearing clinic: 
Their types and etiologies. 


Cornell University 
M.A. Thesis 
46. Gershberg, Myron Ross. The role of 


the early parent-child relationship in 
the etiology of stuttering. 


University of Denver 
M.A. Thesis 
47. Carroll, Philip S. A comparison of 
rhetorical, psychological and mathe- 
matical studies on the nature and use 
of information. 


Ph.D. Thesis 

48. Brasell, Harold. A method to aid 
speech therapists in making a prog- 
nosis. 


University of Florida 

M.A. Thesis 

49. Kasan, Emil Albert Lee. A study of 
the misarticulation of [l] sounds in
children from kindergarten through 
third grade. 

Ph.D. Thesis 

50. Strauss, Raymond Bernard. An investigation
of the effect of mephenesin
carbamate (tolseram) on normal hear- 
ing thresholds as determined by the 
conditioned psychogalvanic skin re- 
sponse and conventional pure tone au- 
diometry. 


University of Illinois 
M.A. Thesis 
51. Miron, Murray Samuel. The conso- 


nant-vowel sound pressure ratio in the 
spoken syllable. 


Ph.D. Theses 

52. Grubb, Patti Murray. A psychophys- 
ical study of vowel formants. 

53. Kurtzrock, George Henry. The ef- 
fects of time and frequency distortion 
upon word intelligibility. 

54. Nagel, Robert Francis. An evalua- 
tion of the ear choice techniques as 


KNOWER: GRADUATE THESES 193 


a method of measurement of auditory 
acuity: a group technique compared 
with an individual technique. 


Indiana University 
M.A. Theses 


55. Cohen, Harriet L. Effect of long ver- 
sus short latency between stimulation 
and response when training unfamiliar 
sounds. 

56. Giolas, Thomas G. Reactions of chil- 
dren to specific nonfluencies in the 
speech of adults. 

57. Suematsu, Kikuyo. Effect of training 
on different levels of articulation skills 
in mentally retarded children. 


Ph.D. Thesis 


58. Robinson, Edward R. An experimental 
investigation of certain commonly 
suggested teaching methods for the 
development of confidence in begin- 
ning students of public speaking. 


State University of Iowa 
M.A. Theses 


59. Dew, Donald. A preliminary investi- 
gation of the perceptual characteristics 
of inter-phonemic transitions. 

60. Hood, William Hamilton. A study of 
the effectiveness of the masking sig- 
nals of three commercial audiometers. 

61. Kools, Joseph Anthony. Speech non- 
fluencies of stuttering and non-stutter- 
ing children. 

62. Noll, John Douglas. An analysis of 
glottal stops in the speech of children 
with cleft palates and of children 
with functional defective articulation. 

63. Winitz, Harris. A quantitative study 
of the repetitions in children’s speech 
in the second year of life. 


Ph.D. Theses 

64. Heinberg, Paul Julius. An experimen- 
tal investigation of methods of meas- 
uring diction. 

65. Skalbeck, Oliver M. The relationship 
of expectancy of stuttering to certain 
other designated variables associated
with stuttering.


Kent State University 
M.A. Thesis 
66. Kolas, Christy. The effect of dental 
malocclusion on the production of 
consonant sounds of elementary school 
children. 





Louisiana State University
M.A. Theses

67. Avant, Velma. A passage for speech
screening in the elementary grades.

68. Maraist, Jean. The effects of auditory
masking upon the speech of stutterers.

69. Warren, Margaret. The organization
of cleft palate teams.

Ph.D. Thesis

70. Nelson, Agnes Denman. A study of
the English Speech of the Hungarians
of Albany, Livingston Parish, Louisiana.

Marquette University
M.A. Thesis

71. Duggan, Susan Ann. A compendium
of current methods of speech therapy
in cerebral palsy, gathered from
American Professional Journals and an
evaluation of them in terms of accepted
theories of therapy.

University of Maryland
M.A. Theses

72. Elkins, Earleen. Effects of side-tone
delay on oral reading responses under
conditions of binaural and mon-aural
stimulus presentation.

73. Inn, Evalyn. Use of speech intelligibility
testing in the analysis of articulation
problems of students with
Spanish language backgrounds.

74. Oshrine, Marsha. An analysis of relationships
between measurement of
reading skills, speech performances,
speech attitudes and measurements of
oral reading rate under conditions of
delayed side-tone.

75. Ruhm, Howard. Pitch discrimination
as a function of stimulus duration in
[illegible].

University of Michigan
M.A. Theses

76. Barnes, Odell W. The course of
development of articulatory speech defects
found among Negro children in
[illegible] schools of North
[illegible].

77. Bohannan, Barbara [illegible]. A non-verbal
hearing test for children.

78. Fransworth, Grover [illegible]. The manipulation
of single word units as a measure
of progress in aphasia therapy.

79. Hargrove, Joy Lorraine. Evaluating
the oral motor response in cerebral
palsied children.

80. Hughes, Joan Lee. An analysis of
visual non-verbal and visual verbal
perception in regard to reading disabilities
in dysphasics.

81. Roderick, Margaret C. An experimental
method for testing the articulation
of blind children.

82. [illegible] C. [illegible] with functional
[illegible].

83. Snyder, Jack M. A study of diadochokinesis
of the lips, tongue and
palate of adults with non-defective
speech.

84. Traub, Barbara L. The designs of a
film to introduce speech reading to
acoustically handicapped and normal
hearing adults.

85. Vanduzer, Marianne L. A study of
severely handicapped cerebral palsy
children in a coordinated therapy program.

Michigan State University
M.A. Thesis

86. Asuncion, Nobleza Castro. A study of
English sounds difficult for Filipino
students.

University of Mississippi
M.A. Thesis

87. Smith, F[illegible]. [illegible]
analysis of an approach to the study
of the development of the speech and
hearing [illegible] in the United
States.

Mississippi Southern College
M.A. Theses

88. [illegible]tive hearing loss. [illegible]
Mississippi Southern College Speech
[illegible] 1949 to [illegible].

89. Collins, Allie C. Speech patterns of
selected residents of North Biloxi,
Mississippi.

90. Gruber, Leslie. An experimental investigation
of the effect of severity
of stuttering upon listener judgments
of the suitability of individuals for
various types of employment.

91. Powell, Mary Louise and Jewel Joanne
Red. A survey of the speech
problems of the first four grades of
the white rural school of Forrest
County, Mississippi.














Mount Holyoke College 
M.A. Thesis 


92. Hirsch, Marianne. A clinical study 
of five children with articulation prob- 
lems: With emphasis on the method 
of presenting sounds. 


Northwestern University 
Ph.D. Theses 


93. Batza, Eugene Mann. Investigation of 
the speech and oral language behavior 
of educable mentally retarded chil- 
dren. 

94. Bzoch, Kenneth Rudolph. An investi- 
gation of the speech of pre-school 
cleft palate children. 

95. Counihan, Donald Thomas. A clinical 
study of the speech efficiency and 
structural adequacy of operated ado- 
lescent and adult cleft palate per- 
sons. 

96. Rosenberg, Philip Emanuel. The influence
of stimulus duration upon differential
intensity sensitivity in normal
and impaired ears.

97. Starr, Clark Dean. A study of some 
of the characteristics of the speech 
mechanism of a group of cleft palate 
children. 

98. Subtelny, Joanne Davis. A  lamina- 
graphic study of nasalized vowels 
produced by cleft palate speakers. 

99. Verlaine, Oscar Usher. The non-verbal
interview: A clinical study of
wordless communication with schizophrenics,
non-psychotic adults and
children.

100. Wilson, Frank Boyd. A study of the
effect of a superimposed respiratory
pattern on the breathing and speech
of eight athetoid children.
Occidental College 
M.A. Thesis 


101. Challgren, Patricia. A systematic ap- 
proach for the teaching of dialects for 
oral interpretation and acting. 


Ohio State University 
M.A. Theses 


102. Baker, Donald Jessup. An experimen- 
tal investigation of the use of bone 
conduction as a standard communica- 
tion channel. 

103. Class, Lois Wunker. A comparative 
study of normal speakers and speech 
defectives with regard to the tactual- 
kinesthetic perception of form with 
the tongue. 


KNOWER: GRADUATE THESES 195 


104. Doudna, Mark Eugene. An analytical 
study of a multiple-tone pulse type 
group auditory screening test. 

105. Fleeman, Carolyn Sue. A survey of 
noise levels and associated hearing 
losses found in a plastics factory. 

106. Hesse, John Frederick. A study of 
the relationship between parental at- 
titudes and the severity of articulation 
defectiveness in the speech of chil- 
dren. 


Ph.D. Theses 


107. Alluisi, Mary Boyle. The relationship 
between vocal characteristics in oral 
reading and the relative information 
of selected phrases. 

108. Hendricks, Richard. Relationships
among tests of intelligibility, word re-
ception and other measures of sym-
bolic formulation.

109. McCroskey, Robert Lee, Jr. The ef-
fect of speech on metabolism: A com-
parison between stutterers and non-
stutterers.

110. Stromsta, Courtney Paul. A method-
ology related to the determination of
the phase angle of bone-conducted
speech sound energy of stutterers and
non-stutterers.

111. Worthington, Anna May Lange. An
investigation of the relationship be-
tween the lipreading ability of con-
genitally deaf high school students
and certain personality factors.


Ohio University 

M.A. Theses 

112. Falk, Marvyn Lee. A study of the 
speech problems of selected fourth 
grade students in Wood County, West 
Virginia. 

113. Ronan, Alice Joan. An exploration of 
the relationship between the number 
of years an individual has been stut- 
tering and his attitude toward stutter- 
ing. 

University of Oklahoma 


M.A. Thesis 


114. Tillman, Tom Whitten. The effects 
of occlusion and masking on specified 
bone conduction thresholds in hard- 
of-hearing subjects. 


College of the Pacific 
M.A. Thesis 
115. Gleeson, Katherine E. A study of
child welfare services in San Francisco
unified school district.


Pennsylvania State University 


M.S. Thesis

116. Brooks, Harry P., Jr. The relation- 
ship between auditory discrimination 
loss and the ability to hear consonant 
errors in speech.

University of Pittsburgh 
M.S. Theses


117. Monteverde, Jean Marie. An investiga-
tion of response of a special audience
to a series of programs presenting lip-
reading by television.

118. Pellegrini, Mario. An exploration of 
speech therapy procedures reported 
helpful and not helpful by clients and 
clinicians. 


Ph.D. Thesis 


119. Varva, Frank. An investigation of the 
effect of auditory deficiency upon per- 
formance with special reference to 
concrete and abstract tasks. 


Purdue University 
Ph.D. Theses 


120. Green, David Samuel. Fundamental 
frequency characteristics of the speech 
of profoundly deaf individuals. 

121. Ham, Richard Errol. Certain effects 
on speech of alterations in the auditory 
feedback of speech defectives and 
normals. 

122. Spuehler, Henry Ernst. Effects and 
interactions of delayed sidetone and 
auditory flutter. 


Queens College 
M.A. Thesis 


123. Arenwald, Helen G. The effectiveness 
of speech training for a group of six 
children with cerebral palsy. 


University of Redlands 
M.A. Theses 


124. Aten, James L. An investigation of 
the relationships between intelligence 
and lipreading ability among deaf 
children. 

125. Borghi, Eugene. A study comparing 
the basic personalities of the mothers 
of stuttering sons with mothers of 
non-stutterers as measured by the 
MMPI. 

126. Borghi, Robert W. A study of the
reaction time of stutterers and non-
stutterers to verbal stimuli.

127. Riedman, Richard. A study of the re- 
lation between accuracy of articula- 
tion and speech intelligibility of resi- 
dential school deaf children. 

128. Stalcup, Edsel L. A study of the 
theory as related to the practice in 
speech correction in the State of Cali- 
fornia. 


University of Southern California 
Ph.D. Theses 


129. Hamilton, William Wallace. An ex- 
perimental investigation of the effects 
of phenobarbital on stage fright. 

130. Harrison, Peggy R. An experimental 
study by X-ray analysis of some 
resonator adjustments in efficient and 
inefficient voice production in low- 
pitched male voices. 

131. Malamuth, Leo Goodman, II. An ex- 
perimental study of the effects of 
speaking rate upon listenability. 

132. Swenson, George F. An experimental 
study of the relationship of parental 
attitudes to functional disorders of ar- 
ticulation in children in two different 
cultural environments. 


Southern Illinois University 
M.A. Theses 


133. Berg, Fred S. The relationship of nasal 
emission to other factors in cleft- 
palate speech. 

134. Mosley, Lloyd T. The effects of 
speech therapy on the scholastic 
achievement of stuttering children. 

135. Perdomo, Dorothy. A comparison of 
four evaluative measures of speech ob- 
tained in a summer camping program 
for crippled children. 


Stanford University 
M.A. Theses 


136. Daly, Catherine Nixon. A survey of 
the social and psychological needs of 
an adult speech reading group. 

137. Owens, Earl Richard. Factors limiting
speech progress in postoperative cleft 
palate cases. 

138. Ray, Beverly Virginia. Therapy for 
the adolescent stutterer. 

139. Ventry, Ira M. Occupational deafness:
Its causes and prevention. 


Ph.D. Thesis 


140. Dawson, Warren Robert. Inter-tester
variability in tests of children: A com-
parison of the pulse-tone and stand-
ard techniques of pure tone audio-
metry.


Syracuse University


Ph.D. Thesis 


141. Feldman, Alan S. An investigation of 
several auditory effects of the Fenes- 
tration Operation for otosclerosis. 


University of Utah 
M.S. Thesis 


142. Shelton, Ralph LeMar. Figure draw- 
ing as an instrument for the diagnosis 
of stage fright. 


Vanderbilt University 
M.S. Theses


143. Drennan, Dorothea. An investigation 
of the relationship of articulatory abil- 
ity to social maturity in the four year 
old child. 

144. England, Gene. An investigation of 
the relevance of vision to the develop- 
ment of articulatory and sound dis- 
criminatory abilities as measured in 
individuals with severe visual impair- 
ment. 

145. Meredith, Jean. A comparison of bi- 
naural and monaural presentation of 
the Doerfler-Stewart test. 

146. Perry, Doris. An investigation of the 
relationship between auditory discrim- 
ination ability and articulation ability 
in a sample of four year old children. 

147. Perry, Effie K. Pure tone thresholds 
measured by two audiometric methods 
for a group of mentally deficient 
children. 

148. Ryberg, Merle S. A comparison of 
various objective and subjective meas- 
ures of the speech disturbance of 
stutterers. 

149. Sewell, Johnny. An experimental com- 
parison of objective and subjective 
methods of evaluating acceptable 
speech performance. 

150. Templeton, Martha. A comparative 
study of the oral and written language 
of hearing impaired children. 


University of Virginia 
M.A. Thesis 


151. Meffret, Molly Lou. The effect of 
serpasil (reserpine) on the severity 
of stuttering. 




University of Washington 
M.A. Theses 


152. Brungard, Jacqueline. An experimental 
investigation into the use of PGSR as 
an adjunct to delayed side tone. 

153. McIntosh, Donald C. A study of the 
breathing patterns of cerebral palsied 
subjects. 


Western Reserve University 
Ph.D. Thesis 


154. Bender, Ruth E. A history of the
education of the deaf: From its be- 
ginning to 1880. 


University of Wichita 

M.A. Theses 

155. Grover, John M. A study comparing 
adaptation to stuttering with adapta- 
tion to auditory delayed speech feed- 
back. 

156. Rundle, Foster W. A study of the 
problems in the organization and op- 
eration of field centers under the in- 
stitute of logopedics. 


University of Wisconsin 
M.S. Theses 


157. Crocker, Patricia C. A study of the 
articulatory development of first and 
second grade children using original 
black and white photographs as stim- 
uli. 

158. Groh, Raymond P. A palatographic
study of certain sounds of the Japanese 
language. 

159. Mahan, Allison L. A study of the ar- 
ticulatory development of nursery 
school and kindergarten children using 
original black and white photographs 
as stimuli. 

160. Suliver, Marjorie E. Development and 
analysis of the speech-pattern stimula- 
tion workbook for children. 


University of Wyoming 
M.A. Thesis 


161. Larson, Robert L. A study of the ef-
fect of immediate binaural feedback
on the speech behavior of two
stutterers.


Index


Aphasia: language test 35; reading per- 
ception in 80; recognition of letters and 
words in 36; therapy 78. 


Articulation and Pronunciation: bilingual-
ism 11; children’s 49, 92, 106; consonants
66, 116; consonant-vowel pressure ratio 51;
course of study 18; of the deaf 127; of Fil-
ipinos 86; glottal stops 62; and hearing 146;
of Hungarians in Louisiana 70*; inter-
phonemic transition 59; Japanese sounds
158; in mentally retarded 32, 57, 93*; Negro
76; and palatal efficiency 26; parental at-
titude 132*; phonemic discrimination 24;
proprioceptive sensibilities 82; teaching dia-
lects 101; testing 16, 29, 31, 64, 67, 73;
therapy 157, 159; vowels 52*, 98*.


Cerebral Palsy: auditory thresholds 22*; 
breathing 100*; music therapy 34; nasal 
emission 133; oral motor responses 79;
therapy 34, 71, 123. 


Children: articulation 49, 76, 92, 106, 132*, 
144, 146; articulation testing 16, 81; bi- 
lingualism in Papago Indian 11; blind 81; 
cerebral palsy 34, 79, 85, 100*; cleft palate 
62, 94*, 97*; concept formation 40; conso- 
nants 66; crippled 135; deaf 12, 124; glottal 
stop 62; hearing 77, 140*; mentally deficient 
32, 57, 93*, 147; parental attitude 132*; 
phonemic discrimination 24; program for 
160; proprioceptive sensibilities 82; reaction 
to nonfluencies 56; sight 144; social ma- 
turity 143; speech perception 27; stuttering 
33, 61, 63, 134; wordless communication 
99*. 

Cleft Palate: and glottal stop 62; hearing 
loss 42; intelligibility in 26; nasalized vowels 
of 98*; operated persons 95*, 137; pre- 
school 94*; speech mechanism in 97*. 


Feedback and Sidetone: auditory flutter 
122*; of defectives and normals 121*; and 
oral reading 72, 74; use of PGSR 152*; stut-
terers 155, 161.


Hearing: bone conduction 102, 110*, 114; 
and cleft palate 42; deafness 12, 111*, 127; 
factory noise 105; fenestration operation 
141*; language development and 12, 150;
lip reading 117, 124; loss 116; and mephene- 
sin carbamate 50*; in multiple sclerosis 39; 
occupational deafness 105, 139; and per- 
formance 119*; pitch discrimination 75, 120*; 
sensitivity 96*; social maturity 143; speech 
perception 27; speech reading 14, 15, 84, 
136; and stuttering 25, 68; survey 45; test-
ing 17, 22*, 23, 54*, 60, 77, 104, 140*, 145,
147. 


Information Theory: studies 47; vocal 
characteristics and 107*. 

Intelligibility: and articulation of the deaf
127; with cleft palate 26; and mental de- 
ficiency 32; rate and 121; with Spanish 
background 73; time and frequency dis- 
tortion 53*; and word reception 108*. 

Language: in aphasia 35; and hearing 150;
of Louisiana Hungarians 70*; vocabulary 
5; word length 1, 7. 

Miscellaneous: crippled children 135; di-
adochokinesis 83; interruptions 2; maloc-
clusions 66; prognosis 48*; tactual kinesthe-
tic perceptions 103; speech problems 112;
topics 9.

Personality: and deafness 111*; and lip 
reading 124; schizophrenics 99*; social dis- 
tance 30; stage fright 4, 58*, 129*, 142. 

Poliomyelitis: bulbar 44. 


School Programs: in articulation 157, 159; 
California 128; for the deaf 154*; Institute 
of Logopedics 156; manual for 14; Mis- 
sissippi Southern College 88; San Francisco 
115; summer camping 135. 


Stutterers: adaptation of 38, 43, 155; au- 
ditory masking 68; and bone conduction 
110*; children’s reaction to 56; diagnosis 
33; employment 90; expectancy 65; and 
feedback 161; metabolism 109*; and moni- 
toring speech 25; mothers 125; non-fluencies 
61, 63; parent-child relationships 46; re- 
action time 126; self-attitudes 113; self-ideal 
concept 41; and serpasil 151; survey form 
19, 21; testing 148; therapy for 19, 134, 138. 

Surveys: California 128; hearing 45; Miss- 
issippi 89, 91; of physicians on speech cor- 
rection 28; stutterers 19, 21; therapy 118; 
West Virginia 112; voice 20. 

Teachers: speech and hearing 87. 

Tests: aphasia 78; articulation 16, 29, 31, 
67, 73, 81; concept formation 40; diction 
64; speech performance 149; stutterers 38, 
148; visible aspects 3, 8, 10. 

Therapy: aphasia 78; cerebral palsy 34,
71, 85, 123; cleft palate 69, 137; films for
84; lip reading 117; prognosis 48*; stutter-
ing 19, 134, 138; surgeon’s point of view
37; survey 118; workbook for 160.

Voice: and information 107*; phrasing 
6; pitch 75, 120*; quality 26; rate 131;
resonator adjustment 130*; teaching 13, 18. 











Book Reviews 


BASOWITZ, HAROLD; PERSKY, HAROLD; KOR-
CHIN, SHELDON J. and GRINKER, ROY R.
Anxiety and Stress. New York: McGraw-
Hill Book Co., 1955. Pp. 320. $8.00.


The purposes of the experiments reported
in this book were ‘to study a group of
young and healthy men as they underwent
the stresses of paratroop training, to assess
the changes in psychological and bio-
chemical functioning in this situation in-
dependently and as they were related, and
to evaluate the particular changes as a func-
tion of initial personality.’ The authors state
that although ‘it is difficult to establish
sharp distinctions between fear and anxiety,’
they would define anxiety as ‘the conscious
and reportable experience of intense dread
and foreboding, conceptualized as internally
derived and unrelated to external threat.’

These studies were conducted at the Air- 
borne Department of the Infantry School, 
U. S. Army, at Ft. Benning, Georgia. The 
men in paratroop training are put through 
an intensive three-week training course, 
terminating in each student making five 
jumps from an aircraft in flight, from al- 
titudes of 1,000 and 1,200 feet. ‘Over and 
above the jumps themselves, the entire 
training period is conducive to a fairly high 
level of psychological tension. The atmos-
phere is serious and businesslike,’ discipline
is more rigorous than in the usual army
unit, and individuals ‘who become im-
mobilized through fear and refuse to jump
from the mock tower . . . or from an air-
craft in flight . . . are termed “quitters.”’
Thus the entire training period is considered
to be ‘a continuous stress situation.’

Eight experiments with eight groupings
of subjects are reported in detail in this
book. Space will permit a description of
only one of the experiments, and a discus-
sion of the results of several. Phase 1, Ex-
periment 1 (Group 1). Subjects: 30 men
out of a starting 700, ages 17-22 years, with
AGCT scores ranging from 61-125 (mean
102). Procedure: Clinical: (1) psychiatric
interview; (2) psychological: anxiety ques-
tionnaire, tachistoscopic Bender-Gestalt
Test, anxiety self-rating, serial subtraction,
tachistoscopic closure (for designs), and
memory for digits; Biochemical: (1) hip-
puric acid tolerance test; (2) blood plasma
glycine level; (3) blood plasma amino acid
level; (4) blood reduced glutathione level;
and (5) blood eosinophil level. Also em-
ployed in the analysis of results were per-
formance records, personal data and be-
havior observations.

Book Reviews is edited by Ernest H.
Henrikson, University of Minnesota.

Other groups studied in subsequent ex-
periments were 69 men who had refused to
jump and who thus presumably demon- 
strated less tolerance for stress, the 10 high- 
est and 10 lowest in hippuric acid excre- 
tion, those who varied in glutathione, etc. 

Hippuric acid excretion was measured
through chemical analysis of the urine and
expressed in terms of grams of hippuric
acid excreted per hour. Blood eosinophil
level was measured by counting eosinophil
cells in blood collected in oxalated bottles
just before and four hours after a particular
stress. Hippuric acid excretion, according
to the authors, is known to be an index
of the amount of free anxiety, and blood
eosinophil level is known to be a measure
of adreno-cortical response to stress. ‘. . .
by contrast, the reduction of eosinophil
cells circulating in the blood stream reflects
a wide variety of physical and psychologi-
cal stresses. . . . Therefore, the joint meas-
urement of blood eosinophil level and hip-
puric acid production may help to define
the nature and severity of the body’s re-
actions to stress.’

Among the results of these studies may 
be cited the following: (1) ‘Subjects high 
in hippuric acid excretion were consistently 
more anxious during training, more variable 
in their ratings, and their anxiety seemed 
less related to the realistic dangers of the 
training situation.’ 

(2) ‘For all men—a greater anxiety of 
failure is expressed than of harm.’ It ap- 
pears that ‘shame-anxiety is the more readily 
evoked, but it is in the amount of harm- 
anxiety that the more disturbed groups are 
distinguished.’ 

(3) Those subjects who refused to jump
and were least able to cope with the train-
ing situation revealed more severe bio-
chemical stress response.

(4) ‘Metabolic indices seem to differ in
their sensitivity to stress effects. At the
most sensitive extreme are the eosinophils,
the most stable is hippuric acid, in inter-
mediate are the remaining measures studied.’

(5) ‘. . . increase in anxiety led to a
decrement in psychological performances in 
the high groups, but a similar increase im- 
proved functioning in the low hippuric 
acid subjects—further evidence of the facil- 
itative effect of anxiety on the less sensitive 
and less anxious subject.’ 

(6) ‘Those who performed worse but 
passed [the course] . . . seemed to be re- 
sponding more to the reality of the situation 
and showing greater concern over the 
threat of failure. On the other hand, fail- 
ing subjects were more greatly disturbed 
and manifested the more profound and 
more neurotic anticipation of being hurt.’ 

The authors state frankly that some of
their results are not strongly substantiated
by statistical significance. They feel, how-
ever, that certain trends reported should
be provocative of further research. Al-
though these studies are not of general in-
terest to speech pathologists and therapists,
there is a suggestion that the anxiety of the
stutterer might be investigated through the
procedures used in these experiments. To
this writer’s knowledge no biochemical
studies employing the specific tests de-
scribed in this book have been reported in
the literature. Two ‘implications of the re-
sults’ are possibly of interest to those of
us who are concerned with stuttering. (1)
. . . we can reasonably postulate a con-
tinuum of events ranging from situations
which evoke anxiety in everyone to those 
which are snbaciighilly unique for the in- 
dividual. (2) Our present answer indicates 
that personalities have pre-dispositions to 
various types of anxiety based on past ex- 
perience, but that the structure of social 
groups usually emphasizes success and hence 
stirs up shame-anxiety. Increasing success 
may neutralize this type of anxiety, and 
failures accentuate it to the point of dis- 
rupting even available personal skills. 


Frank P. Bakes 
University of Pennsylvania 





